Monday, February 10, 2020

How we can do fairer randomized seeding, and why we should

Very generally speaking, Melee tournaments are (to the best of my understanding) basically all seeded using the following method: Players are sorted into tiers, and then the seeding within each tier is shuffled uniformly at random. The TO then makes adjustments afterwards to account for concurrent waves, seeding conflicts, people playing their friends, etc. This is the method natively supported by smash.gg.

Keep in mind this is the most general form; there's two more specific versions of this that are commonly used:
-If everyone is in their own tier, then there's no randomness (there's only one way to shuffle a list of one player) and this is the method most TOs of small tournaments use, just ordering the players by skill.
-Seeding the top X players without randomizing, and then past that splitting players into tiers (see e.g. this article and note that this is what is done at majors).

The main problem with completely deterministic seeding (ignoring inaccuracy due to misperceptions of skill) is the same matchups between the high seeds are likely to occur repeatedly across multiple tournaments, if most of the same people attend the same tournaments regularly. This can cause a player's average performance in bracket to not be reflective of their skill relative to the other players. For example, on the Summer 2019 MPGR Axe was ranked 3 and Zain was ranked 6, which means that if all the top 6 players attended a tournament and were seeded according to the MPGR, Zain would be likely to play in Axe in winners at that tournament. This means Zain's average bracket performance at tournaments using this ranking would be more reflective of how he does in one of his hardest player matchups, rather than how he does against the field.

Randomizing seeding fixes this by making sure there's more variance in who plays who at tournaments. However, "randomizing within tiers" as described above has a problem of fairness: the best player in a tier can only be underseeded, and the worst player in a tier can only be overseeded. Another way to look at this is: treating tier 1 as the best tier, tier 2 as the second best, etc., the skill gap between the worst tier n player and the best tier n+1 player may be much smaller than the skill gap between the best tier n and worst tier n player. However, crossing the former skill gap is much more rewarding to a player than crossing the latter. At majors, due to many different regions competing, it may be very hard to even tell who the top or bottom of a tier is, so this might be an unaddressable issue. But at locals, stagnating matchups are still an issue (if not more of an issue than at majors - at locals it's more likely that there's a lot of head-to-head data and thus a consensus ordering of the top seeds). So we may want to still use randomized seeding at a local level, where it is addressable.

To formalize the problem a bit more, suppose the players have trusted ratings (where a higher rating = higher skill). Let's say our randomized seeding scheme has monotonicity if the probability of seeding player 1 with rating r_1 above player 2 with rating r_2 is increasing in r_1 - r_2. That is, the larger the skill gap between two players, the more likely it is that the stronger player is seeded over the weaker one. Randomizing within tiers does not satisfy monotinicity. Rephrasing the example from before: when player 1 is at the bottom of tier n-1 and player 2 at the top of tier n, r_1 - r_2 could be very small but player 1 is always seeded above player 2. When player 1 is the top of tier n and player is at the bottom of tier n and the tier is seeded using a uniformly random shuffle, r_1 - r_2 could be very large but player 1 is still seeded below player 2 50% of the time.

A very simple fix is to instead use the following method of randomization. Initially, player i has an assigned rating r_i. We choose a distribution X, and for every player draw an independent sample x_i from X. We then sort players by r_i + x_i (let's assume X is continuous, so there's almost surely no ties).

First, note that if we as a TO have the flexibility to choose the ratings ourselves, this is a generalization of randomizing within tiers. If we assign everyone in tier n the rating -n, and X is a random variable always between, say, 0 and 1/2, then anyone in tier n can't ever have r_i + x_i exceed -(n-1), i.e. can't ever jump ahead of a player in tier n-1. This means e.g. if we do believe everyone in a tier is at the same skill level, we can use this scheme to emulate randomizing within tiers, but at the same time this scheme is much more flexible.

Next, note that if we are given the ratings (by, say, an outside source or our own best judgment) and apply this scheme, it satisfies monotonicity. Proof: Player 1 will be seeded above player 2 if r_1 + x_1 > r_2 + x_2, i.e. with probability Pr[x_1 - x_2 > r_2 - r_1]. Since x_1, x_2 are independent samples from X, x_1 - x_2 has the same distribution regardless of what r_1, r_2 are, and for any random variable Y, Pr[Y > r_2 - r_1] is increasing in r_1 - r_2, giving the monotonicity property that randomizing within tiers lacks.

Finally, implementing this method of randomizing seeding is very easy. In 5-10 minutes I made a spreadsheet here implementing this, using the noise distribution X = Uniform(0, 1/2) (chosen because it's bounded, i.e. won't ever seed me over Hungrybox since our skill difference is large). To refresh the randomized seeding, you just have to edit any cell (in your own copy, since it's view-only). The first two columns are freely editable (just don't cut them), so one can e.g. copy in a table generated from a site like GarPR or Braacket, and perhaps edit the third column to tune the noise distribution according to one's preferences (e.g. more variance, maybe a more light-tailed or heavy-tailed distribution).

No comments:

Post a Comment