Markov method

Functions to compute rating and ranking using the Markov method.

rate_markov(cr_data, ..., fill = list(), stoch_modify = teleport(0.15),
  weights = 1, force_nonneg_h2h = TRUE)

rank_markov(cr_data, ..., fill = list(), stoch_modify = teleport(0.15),
  weights = 1, force_nonneg_h2h = TRUE, keep_rating = FALSE,
  ties = c("average", "first", "last", "random", "max", "min"),
  round_digits = 7)

Arguments

cr_data	Competition results in a format ready for as_longcr().
...	Name-value pairs of Head-to-Head functions (see h2h_long()).
fill	A named list that for each Head-to-Head function supplies a single value to use instead of NA for missing pairs (see h2h_long()).
stoch_modify	A single function that modifies a column-stochastic matrix, or a list of such functions (see Stochastic matrix modifiers).
weights	Weights for different stochastic matrices.
force_nonneg_h2h	Whether to force nonnegative values in Head-to-Head matrix.
keep_rating	Whether to keep rating column in ranking output.
ties	Value for ties in round_rank().
round_digits	Value for round_digits in round_rank().

Value

rate_markov() returns a tibble with columns player (player identifier) and rating_markov (Markov rating). The ratings should sum to 1. A larger value indicates better player performance.

rank_markov() returns a tibble with columns player, rating_markov (if keep_rating = TRUE) and ranking_markov (Markov ranking computed with round_rank()).

Details

Markov ratings are based on players 'voting' for other players being better. The algorithm is as follows:

'Voting' is done with Head-to-Head values supplied in ... (see h2h_mat() for technical details and section Design of Head-to-Head values for design details). Take special care with Head-to-Head values for self-plays (when player1 equals player2). Head-to-Head values should be non-negative. Use force_nonneg_h2h = TRUE to enforce this by subtracting the minimum Head-to-Head value whenever some Head-to-Head value is strictly negative.
Head-to-Head values are transformed into a matrix, which is normalized to be a column-stochastic Markov matrix S (every column should sum to 1). All missing values are converted to 0. To use a different value, supply the fill argument.
S is modified with stoch_modify to deal with possible problems in S, such as reducibility and columns with all 0.
The stationary vector is computed by treating S as the transition probability matrix of a Markov chain (transition probabilities from state i are the elements of column i). The result is declared to be the Markov ratings.
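
The steps above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: markov_sketch() is a hypothetical helper, the win matrix is typed by hand so that it is consistent with the ratings shown in Examples (it is not read from ncaa2005), all-zero columns are replaced with uniform votes (assumed here to be what vote_equal does), and the stationary vector is found by plain power iteration, which may differ from how comperank computes it.

```r
# Hypothetical win counts: entry [i, j] is the number of wins of the row
# player over the column player (typed by hand, an assumption).
teams <- c("Duke", "Miami", "UNC", "UVA", "VT")
wins <- matrix(
  c(0, 0, 0, 0, 0,   # Duke
    1, 0, 1, 1, 1,   # Miami
    1, 0, 0, 1, 0,   # UNC
    1, 0, 0, 0, 0,   # UVA
    1, 0, 1, 1, 0),  # VT
  nrow = 5, byrow = TRUE, dimnames = list(teams, teams)
)

# Hypothetical helper; assumes a square numeric matrix without NA.
markov_sketch <- function(h2h, n_iter = 1000) {
  # Step 1: shift values so that none is negative
  if (min(h2h) < 0) h2h <- h2h - min(h2h)

  # Step 2: normalize every nonzero column to sum to 1
  col_sums <- colSums(h2h)
  zero_cols <- col_sums == 0
  s <- sweep(h2h, 2, ifelse(zero_cols, 1, col_sums), "/")

  # Step 3: replace all-zero columns with uniform votes
  if (any(zero_cols)) s[, zero_cols] <- 1 / nrow(s)

  # Step 4: power iteration towards the stationary vector r = S r
  r <- rep(1 / nrow(s), nrow(s))
  for (i in seq_len(n_iter)) r <- as.vector(s %*% r)
  setNames(r / sum(r), rownames(h2h))
}

round(markov_sketch(wins), 4)
```

With this input the undefeated player (Miami) has an all-zero column, which is exactly the case the modifier has to repair before a stationary vector exists. The rounded result should agree with the rating_markov column of the first example in Examples.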

Because stochastic matrices share a common scale and structure, different 'votings' can naturally be combined into one stochastic matrix:

The long format of Head-to-Head values is computed using ... (which in this case should contain several expressions for Head-to-Head functions).
Each set of Head-to-Head values is transformed into a matrix, which is normalized to be column-stochastic.
Each stochastic matrix is modified with its respective modifier stored in stoch_modify (which can be a list of functions).
The resulting stochastic matrix is computed as the weighted average of the modified stochastic matrices.

The general R recycling rule is applied to the Head-to-Head functions in ... (treated as a list) and to the stoch_modify argument. If stoch_modify is a single function, it is first converted to a list with one function.

weights is recycled to the maximum length of the two recycled elements mentioned above and is then normalized to sum to 1.
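
The recycling and weighting rules can be illustrated with a small base R sketch. Everything here is hypothetical: combine_sketch(), identity_modifier() and teleport_sketch() are stand-ins written for this illustration (teleport_sketch() assumes that teleportation mixes the matrix with a uniform one), and the two input matrices are made-up column-stochastic matrices for three players.

```r
# Two made-up column-stochastic matrices (filled column by column)
s_win <- matrix(c(0, 0.5, 0.5,  1, 0, 0,  0.5, 0.5, 0), nrow = 3)
s_score <- matrix(c(0.2, 0.3, 0.5,  0.6, 0.1, 0.3,  0.4, 0.4, 0.2), nrow = 3)

# Hypothetical modifiers: one leaves the matrix as is, the other mixes it
# with a uniform matrix (an assumed reading of "teleportation")
identity_modifier <- function(stoch) stoch
teleport_sketch <- function(prob) {
  function(stoch) (1 - prob) * stoch + prob / nrow(stoch)
}

combine_sketch <- function(stoch_list, modify, weights = 1) {
  # A single function becomes a list with one function
  if (is.function(modify)) modify <- list(modify)

  # Recycle matrices and modifiers to a common length
  n <- max(length(stoch_list), length(modify))
  stoch_list <- rep_len(stoch_list, n)
  modify <- rep_len(modify, n)

  # Recycle weights to that length and normalize them to sum to 1
  weights <- rep_len(weights, n)
  weights <- weights / sum(weights)

  # Apply each modifier to its matrix, then take the weighted average
  modified <- Map(function(s, f) f(s), stoch_list, modify)
  Reduce(`+`, Map(`*`, modified, weights))
}

combined <- combine_sketch(
  list(s_win, s_score),
  list(identity_modifier, teleport_sketch(0.15)),
  weights = c(0.8, 0.2)
)
colSums(combined)
```

Because each modified matrix is column-stochastic and the weights sum to 1, every column of the weighted average also sums to 1. Passing weights = c(4, 1) gives the same matrix, since only the proportions matter after normalization.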

Ratings are computed based only on games between players of interest (see Players).

Design of Head-to-Head values

Head-to-Head values in these functions are assumed to follow the property which can be equivalently described in two ways:

In terms of matrix format: the larger the Head-to-Head value in row i and column j, the better the player from row i performed against the player from column j.
In terms of long format: the larger the Head-to-Head value, the better player1 performed against player2.
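
The equivalence of the two descriptions can be seen by placing long-format values into a matrix by hand. The sketch below uses only base R and made-up data. The column name h2h_value and the object names are hypothetical, and the conversion is not how h2h_long() or h2h_mat() are implemented.

```r
# Made-up long format: one row per ordered pair (player1, player2)
long_sketch <- data.frame(
  player1 = c("A", "A", "B"),
  player2 = c("A", "B", "A"),
  h2h_value = c(0, 3, 1),
  stringsAsFactors = FALSE
)

# Matrix format: player1 indexes rows, player2 indexes columns
ids <- sort(unique(c(long_sketch$player1, long_sketch$player2)))
mat_sketch <- matrix(
  NA_real_, nrow = length(ids), ncol = length(ids),
  dimnames = list(ids, ids)
)
mat_sketch[cbind(long_sketch$player1, long_sketch$player2)] <- long_sketch$h2h_value

mat_sketch["A", "B"]  # value for player1 = "A" against player2 = "B"
```

Here the value for A against B is larger than the value for B against A, so under this design A is read as having performed better than B. The pair (B, B) is absent from the long format and stays NA in the matrix, which is the kind of missing pair the fill argument addresses.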

This design is chosen because in most competitions the goal is to score more points, not fewer. It also allows smoother use of h2h_funs from the comperes package.

Players

comperank makes it possible to restrict computations to a certain set of players. This is done by storing the player column (in longcr format) as a factor whose levels specify all players of interest. If player is a factor, the result is returned only for players from its levels. Otherwise, it is returned for all players present in the data.
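
A minimal base R sketch of preparing such a factor is shown below. The data frame and the player names are made up, and the sketch only demonstrates how factor levels behave in R. It makes no claim about how comperank processes the result internally.

```r
# Made-up competition results in a long layout with a player column
results_sketch <- data.frame(
  game = c(1, 1, 2, 2, 3, 3),
  player = c("Ann", "Bob", "Bob", "Cid", "Ann", "Cid"),
  score = c(3, 1, 2, 2, 0, 4),
  stringsAsFactors = FALSE
)

# Players of interest: "Dee" has no games yet, "Cid" is deliberately left out
players_of_interest <- c("Ann", "Bob", "Dee")
results_sketch$player <- factor(
  results_sketch$player,
  levels = players_of_interest
)

levels(results_sketch$player)  # all players of interest, including "Dee"
is.na(results_sketch$player)   # TRUE for rows of the excluded player "Cid"
```

The levels carry the full set of players of interest even when some of them have no recorded games, while values outside the levels become NA.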

References

Wikipedia page for Markov chain.

Examples

rate_markov(
  cr_data = ncaa2005,
  # player2 "votes" for player1 if player1 won
  comperes::num_wins(score1, score2, half_for_draw = FALSE),
  stoch_modify = vote_equal
)
#> # A tibble: 5 x 2
#>   player rating_markov
#>   <chr>          <dbl>
#> 1 Duke          0.0876
#> 2 Miami         0.438 
#> 3 UNC           0.146 
#> 4 UVA           0.109 
#> 5 VT            0.219 

rank_markov(
  cr_data = ncaa2005,
  comperes::num_wins(score1, score2, half_for_draw = FALSE),
  stoch_modify = vote_equal
)
#> # A tibble: 5 x 2
#>   player ranking_markov
#>   <chr>           <dbl>
#> 1 Duke                5
#> 2 Miami               1
#> 3 UNC                 3
#> 4 UVA                 4
#> 5 VT                  2

rank_markov(
  cr_data = ncaa2005,
  comperes::num_wins(score1, score2, half_for_draw = FALSE),
  stoch_modify = vote_equal,
  keep_rating = TRUE
)
#> # A tibble: 5 x 3
#>   player rating_markov ranking_markov
#>   <chr>          <dbl>          <dbl>
#> 1 Duke          0.0876              5
#> 2 Miami         0.438               1
#> 3 UNC           0.146               3
#> 4 UVA           0.109               4
#> 5 VT            0.219               2

# Combine multiple stochastic matrices and
# use inappropriate `fill` which misrepresents reality
rate_markov(
  cr_data = ncaa2005[-(1:2), ],
  win = comperes::num_wins(score1, score2, half_for_draw = FALSE),
  # player2 "votes" for player1 proportionally to the amount player1 scored
  # more in direct confrontations
  score_diff = max(mean(score1 - score2), 0),
  fill = list(win = 0.5, score_diff = 10),
  stoch_modify = list(vote_equal, teleport(0.15)),
  weights = c(0.8, 0.2)
)
#> # A tibble: 5 x 2
#>   player rating_markov
#>   <chr>          <dbl>
#> 1 Duke          0.305 
#> 2 Miami         0.308 
#> 3 UNC           0.103 
#> 4 UVA           0.0936
#> 5 VT            0.191