Data Enrichment and Derived Tables


The basic dataframes provide a good starting point for analysis, but they leave a few gaps.

For example, the withdrawals data does not directly name the class that a withdrawn vehicle belongs to; the team.clazz column holds only an opaque identifier.

To recover the class, we need to merge in data from the clazz dataset.

from dakar_rallydj.getter import DakarAPIClient

dakar = DakarAPIClient(
    use_cache=True,
    backend='memory',
    # cache_name='dakar_cache',
    expire_after=3600  # Cache for 1 hour
)

Enriching the Withdrawal Data

Let’s enrich the base withdrawal dataframe by adding a column that identifies the vehicle class. To do this, we can start from the dataframes returned by dakar.get_withdrawals():

withdrawals_df, withdrawn_competitors_df, withdrawn_teams_df = dakar.get_withdrawals()

withdrawn_teams_df.head()
team.bib team.brand team.model team.vehicle team.vehicleImg team.clazz team.w2rc
0 202 MINI JCW RALLY 3.0I X-RAID MINI JCW TEAM https://img.aso.fr/core_app/img-motorSports-da... 96c0869600e0013dbf5f86f60e5c4da4 False
1 205 TOYOTA HILUX IMT EVO TOYOTA GAZOO RACING https://img.aso.fr/core_app/img-motorSports-da... 96c0869600e0013dbf5f86f60e5c4da4 False
2 206 TOYOTA HILUX IMT EVO TOYOTA GAZOO RACING https://img.aso.fr/core_app/img-motorSports-da... 96c0869600e0013dbf5f86f60e5c4da4 False
3 208 TOYOTA HILUX GURTAM TOYOTA GAZOO RACING BALTICS https://img.aso.fr/core_app/img-motorSports-da... 96c0869600e0013dbf5f86f60e5c4da4 False
4 213 MD OPTIMUS MD RALLYE SPORT https://img.aso.fr/core_app/img-motorSports-da... f00d7ec8d2d96e9cf11aa515109376cf False
clazz_df = dakar.get_clazz()
clazz_df.head()
refueling promotionalDisplay reference label position shortLabel _bind _id _parent $group color tinyLabel ar en es fr category categoryClazz
0 0 True 2025-A-T1-+ + 4 cat.name.A_T1_+ allClazz-2025-A 96c0869600e0013dbf5f86f60e5c4da4 categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... NaN NaN T1+: Prototype Cross-Country Cars 4x4 T1+: Prototype Cross-Country Cars 4x4 T1+: Prototype Cross-Country Cars 4x4 T1+ : Voitures Tout-terrain Prototypes 4x4 A 2025-A-T1
1 0 True 2025-A-T1-1 1 0 cat.name.A_T1_1 allClazz-2025-A f666973e89db183ecfefc75c3af8ffb1 categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... NaN NaN T1.1 Prototype Cross-Country Cars 4x4 T1.1 Prototype Cross-Country Cars 4x4 T1.1 Prototype Cross-Country Cars 4x4 T1.1 : Voitures Tout-terrain Prototypes 4x4 A 2025-A-T1
2 0 True 2025-A-T1-2 2 1 cat.name.A_T1_2 allClazz-2025-A f00d7ec8d2d96e9cf11aa515109376cf categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... NaN NaN T1.2 Prototype Cross-Country Cars 4x2 T1.2 Prototype Cross-Country Cars 4x2 T1.2 Prototype Cross-Country Cars 4x2 T1.2 : Voitures Tout-terrain Prototypes 4x2 A 2025-A-T1
3 0 True 2025-A-T1-3 3 2 cat.name.A_T1_3 allClazz-2025-A f071b5dbfd586a4ba46100196a98a9c4 categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... NaN NaN T1.3 FIA: النتيجة T1.3: SCORE T1.3 : SCORE T1.3 : SCORE A 2025-A-T1
4 0 True 2025-A-T1-U U 3 cat.name.A_T1_U allClazz-2025-A 1501ebcbaf3ad27e72aecfba7faa8037 categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... NaN NaN T1.U: "Ultimate" Prototype Cross-Country Cars T1.U: "Ultimate" Prototype Cross-Country Cars T1.U: "Ultimate" Prototype Cross-Country Cars T1.U : Voitures Tout-Terrain Prototypes "Ultim... A 2025-A-T1
import pandas as pd

# Join team.clazz to the class _id, keeping only the readable class columns
clazz_map = pd.merge(
    withdrawn_teams_df[["team.bib", "team.clazz"]],
    clazz_df[["_id", "reference", "categoryClazz", "en"]],
    left_on="team.clazz", right_on="_id",
).drop(columns=["team.clazz", "_id"]).rename(columns={"en": "clazz_label"})

clazz_map.head()
   team.bib    reference  categoryClazz                            clazz_label
0       202  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4
1       205  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4
2       206  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4
3       208  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4
4       213  2025-A-T1-2      2025-A-T1  T1.2 Prototype Cross-Country Cars 4x2
clazz_map['reference'].unique()
array(['2025-A-T1-+', '2025-A-T1-2', '2025-A-T1-1', '2025-A-T3-1',
       '2025-A-T4-SSV1', '2025-A-T4-T4', '2025-A-T5-1', '2025-A-T5-2'],
      dtype=object)
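
Because pd.merge defaults to an inner join, any withdrawn team whose class identifier is missing from the clazz data would silently drop out of clazz_map. The following is a minimal sketch of one way to check for that, using small made-up frames in place of withdrawn_teams_df and clazz_df (the identifiers and the bib 999 are illustrative assumptions, not real data):

```python
import pandas as pd

# Toy stand-ins for withdrawn_teams_df and clazz_df (illustrative values only)
teams = pd.DataFrame({
    "team.bib": [202, 213, 999],
    "team.clazz": ["id_a", "id_b", "id_missing"],
})
clazz = pd.DataFrame({
    "_id": ["id_a", "id_b"],
    "reference": ["2025-A-T1-+", "2025-A-T1-2"],
})

# A left join keeps every team; indicator=True adds a `_merge` column
# recording whether a matching class row was found.
# validate="many_to_one" raises if `_id` is not unique in the class table.
checked = pd.merge(
    teams, clazz,
    left_on="team.clazz", right_on="_id",
    how="left", indicator=True, validate="many_to_one",
)

# Bibs with no matching class row
unmatched = checked.loc[checked["_merge"] == "left_only", "team.bib"].tolist()
print(unmatched)
```

If the list is empty for the real data, the inner join used above loses nothing; otherwise a left join may be the safer choice.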

We can also pull in further information from the groups data.

groups_df = dakar.get_groups()
groups_df.head()
position shortLabel reference label tinyLabel promotionalDisplay _bind _origin _id _parent color ar en es fr
8 0 cat.name.A_T1 2025-A-T1 T1 ULT True allGroups-2025 categoryGroup-2025-A b49155b3f5670d2a907aa01e319876b8 category-2025:63b4f5da4591200d0a4cc239245eb03a #EBBC4E Ultimate Ultimate Ultimate Ultimate
7 1 cat.name.A_T2 2025-A-T2 T2 STK True allGroups-2025 categoryGroup-2025-A 4dac064bf100bc806b91e7f2e7758297 category-2025:63b4f5da4591200d0a4cc239245eb03a #C7C9C7 Stock Stock Stock Stock
5 4 cat.name.A_T3 2025-A-T3 T3 CHG True allGroups-2025 categoryGroup-2025-A 15f329900afa29e3e6b099ae681ebe12 category-2025:63b4f5da4591200d0a4cc239245eb03a #E04E39 Challenger Challenger Challenger Challenger
6 5 cat.name.A_T4 2025-A-T4 T4 SSV True allGroups-2025 categoryGroup-2025-A 423ea731fdcba5cda62c8334985889b0 category-2025:63b4f5da4591200d0a4cc239245eb03a #A7C6ED SSV SSV SSV SSV
9 6 cat.name.A_T5 2025-A-T5 T5 TRK True allGroups-2025 categoryGroup-2025-A f1a437ac1135c9d9a5e33f5096f95259 category-2025:63b4f5da4591200d0a4cc239245eb03a #2D2926 شاحنة Truck Camión Camion

For convenience, let’s merge some of that in:

# Add group-level labels and colors, joining on the group reference
clazz_map = pd.merge(
    clazz_map,
    groups_df[["reference", "tinyLabel", "label", "color", "en"]].rename(
        columns={"reference": "categoryClazz"}),
    on="categoryClazz",
).rename(columns={"en": "group_label"})

# Sort by bib and reset the index.
# Open question: would an index that supports later index-based merges be more useful?
# Note that pd.merge on columns already discards the original index,
# so team.bib is the reliable key for joining back.
clazz_map.sort_values("team.bib", inplace=True)
clazz_map.reset_index(drop=True, inplace=True)

clazz_map.head()
   team.bib    reference  categoryClazz                            clazz_label  tinyLabel  label    color  group_label
0       202  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4        ULT     T1  #EBBC4E     Ultimate
1       205  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4        ULT     T1  #EBBC4E     Ultimate
2       206  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4        ULT     T1  #EBBC4E     Ultimate
3       208  2025-A-T1-+      2025-A-T1  T1+: Prototype Cross-Country Cars 4x4        ULT     T1  #EBBC4E     Ultimate
4       213  2025-A-T1-2      2025-A-T1  T1.2 Prototype Cross-Country Cars 4x2        ULT     T1  #EBBC4E     Ultimate
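
With clazz_map keyed by bib, it can be joined onto any per-competitor table. The sketch below shows the idea on toy data: the withdrawals_demo frame and its bib and stage columns are hypothetical stand-ins, since the actual column names of withdrawals_df are not shown here.

```python
import pandas as pd

# Toy stand-in for clazz_map (illustrative values only)
clazz_map_demo = pd.DataFrame({
    "team.bib": [202, 205, 213, 300],
    "tinyLabel": ["ULT", "ULT", "ULT", "CHG"],
    "group_label": ["Ultimate", "Ultimate", "Ultimate", "Challenger"],
})

# Hypothetical per-withdrawal table; `bib` and `stage` are assumed column names
withdrawals_demo = pd.DataFrame({
    "bib": [205, 300, 202],
    "stage": [2, 2, 5],
})

# A left join keeps every withdrawal row and attaches the group columns
enriched = withdrawals_demo.merge(
    clazz_map_demo, left_on="bib", right_on="team.bib", how="left"
).drop(columns="team.bib")

# Count withdrawals per group
counts = enriched.groupby("group_label").size().to_dict()
print(counts)
```

Joining on the bib column rather than on the index keeps the lookup independent of how clazz_map happens to be sorted or reindexed.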