ARM Data Preparation
First, calculate the goalkeeper save rate (rate_saved) from clean_laliga_playerDF, and select the variables season, teamName, position, game_minutes, rate_saved, and rating to generate a new dataframe. Second, calculate each team's points for each season (PTS) from the clean_laliga_teamDF CSV file and extract the columns season, teamName, and PTS to generate a second dataframe. Then merge these two new dataframes, using teamName and season as the join keys. Next, keep only the rows for the goalkeeper position, and of those only the goalkeepers whose playing time is greater than 95 minutes (a full game lasts 90 minutes, so a goalkeeper with 95 minutes or less has played about one game at most, which is too little to be informative, and these rows can be excluded directly). The result is a new dataframe that is ready for the next step. (As shown in the following figure, the first stage of data preparation is done.)
Building on the first stage, select the variables game_minutes, rating, rate_saved, and PTS to generate a new dataframe. Because these variables are numeric, they need to be converted to transaction data before association rules can be mined.
The conversion rules are as follows:
Convert the dataset to transaction data by following the rules above, then save the resulting dataframe in CSV format for exploring association rules.
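One way to carry out this kind of conversion is to bin each numeric column into labelled ranges so that every row becomes a set of items. The sketch below uses pandas for that. The bin edges, the low/mid/high labels, and the sample rows are hypothetical placeholders chosen only for illustration; they are not the conversion rules used in this analysis.

```python
import pandas as pd

# Invented sample of the four numeric variables.
numeric = pd.DataFrame({
    "game_minutes": [900, 1800, 3000],
    "rating": [6.2, 6.8, 7.3],
    "rate_saved": [0.55, 0.70, 0.80],
    "PTS": [35, 55, 80],
})

# Hypothetical bin edges (NOT the document's rules); substitute the real ones.
bin_edges = {
    "game_minutes": [0, 1000, 2500, float("inf")],
    "rating": [0, 6.5, 7.0, float("inf")],
    "rate_saved": [0, 0.6, 0.75, float("inf")],
    "PTS": [0, 40, 60, float("inf")],
}
levels = ["low", "mid", "high"]

# Replace each number with an item label such as "rating_mid".
transactions = pd.DataFrame({
    col: pd.cut(
        numeric[col],
        bins=edges,
        labels=[f"{col}_{level}" for level in levels],
    ).astype(str)
    for col, edges in bin_edges.items()
})

# One transaction per line, items separated by commas, no header or index.
# Passing a file path as the first argument would write the CSV to disk instead.
csv_text = transactions.to_csv(index=False, header=False)
print(csv_text)
```

Prefixing each label with its column name keeps items from different variables distinct, so "rating_high" and "PTS_high" are never confused when rules are mined.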