Monday, February 10, 2020

How we can do fairer randomized seeding, and why we should

Very generally speaking, Melee tournaments are (to the best of my understanding) basically all seeded using the following method: Players are sorted into tiers, and then the seeding within each tier is shuffled uniformly at random. The TO then makes adjustments afterwards to account for concurrent waves, seeding conflicts, people playing their friends, etc. This is the method natively supported by smash.gg.

Keep in mind this is the most general form; there are two more specific versions of this that are commonly used:
-If everyone is in their own tier, then there's no randomness (there's only one way to shuffle a list of one player) and this is the method most TOs of small tournaments use, just ordering the players by skill.
-Seeding the top X players without randomizing, and then past that splitting players into tiers (see e.g. this article and note that this is what is done at majors).

The main problem with completely deterministic seeding (ignoring inaccuracy due to misperceptions of skill) is that the same matchups between the high seeds are likely to occur repeatedly across multiple tournaments, if most of the same people attend the same tournaments regularly. This can cause a player's average performance in bracket to not be reflective of their skill relative to the other players. For example, on the Summer 2019 MPGR Axe was ranked 3 and Zain was ranked 6, which means that if all the top 6 players attended a tournament and were seeded according to the MPGR, Zain would be likely to play Axe in winners at that tournament. This means Zain's average bracket performance at tournaments using this ranking would be more reflective of how he does in one of his hardest player matchups, rather than how he does against the field.

Randomizing seeding fixes this by making sure there's more variance in who plays who at tournaments. However, "randomizing within tiers" as described above has a problem of fairness: the best player in a tier can only be underseeded, and the worst player in a tier can only be overseeded. Another way to look at this is: treating tier 1 as the best tier, tier 2 as the second best, etc., the skill gap between the worst tier n player and the best tier n+1 player may be much smaller than the skill gap between the best tier n and worst tier n player. However, crossing the former skill gap is much more rewarding to a player than crossing the latter. At majors, due to many different regions competing, it may be very hard to even tell who the top or bottom of a tier is, so this might be an unaddressable issue. But at locals, stagnating matchups are still an issue (if not more of an issue than at majors - at locals it's more likely that there's a lot of head-to-head data and thus a consensus ordering of the top seeds). So we may want to still use randomized seeding at a local level, where it is addressable.

To formalize the problem a bit more, suppose the players have trusted ratings (where a higher rating = higher skill). Let's say our randomized seeding scheme has monotonicity if the probability of seeding player 1 with rating r_1 above player 2 with rating r_2 is increasing in r_1 - r_2. That is, the larger the skill gap between two players, the more likely it is that the stronger player is seeded over the weaker one. Randomizing within tiers does not satisfy monotonicity. Rephrasing the example from before: when player 1 is at the bottom of tier n-1 and player 2 at the top of tier n, r_1 - r_2 could be very small but player 1 is always seeded above player 2. When player 1 is at the top of tier n and player 2 is at the bottom of tier n and the tier is seeded using a uniformly random shuffle, r_1 - r_2 could be very large but player 1 is still seeded below player 2 50% of the time.

A very simple fix is to instead use the following method of randomization. Initially, player i has an assigned rating r_i. We choose a distribution X, and for every player draw an independent sample x_i from X. We then sort players by r_i + x_i (let's assume X is continuous, so there's almost surely no ties).

First, note that if we as a TO have the flexibility to choose the ratings ourselves, this is a generalization of randomizing within tiers. If we assign everyone in tier n the rating -n, and X is a random variable always between, say, 0 and 1/2, then anyone in tier n can't ever have r_i + x_i exceed -(n-1), i.e. can't ever jump ahead of a player in tier n-1. This means e.g. if we do believe everyone in a tier is at the same skill level, we can use this scheme to emulate randomizing within tiers, but at the same time this scheme is much more flexible.

Next, note that if we are given the ratings (by, say, an outside source or our own best judgment) and apply this scheme, it satisfies monotonicity. Proof: Player 1 will be seeded above player 2 if r_1 + x_1 > r_2 + x_2, i.e. with probability Pr[x_1 - x_2 > r_2 - r_1]. Since x_1, x_2 are independent samples from X, x_1 - x_2 has the same distribution regardless of what r_1, r_2 are, and for any random variable Y, Pr[Y > r_2 - r_1] is increasing in r_1 - r_2, giving the monotonicity property that randomizing within tiers lacks.

Finally, implementing this method of randomizing seeding is very easy. In 5-10 minutes I made a spreadsheet here implementing this, using the noise distribution X = Uniform(0, 1/2) (chosen because it's bounded, i.e. won't ever seed me over Hungrybox since our skill difference is large). To refresh the randomized seeding, you just have to edit any cell (in your own copy, since it's view-only). The first two columns are freely editable (just don't cut them), so one can e.g. copy in a table generated from a site like GarPR or Braacket, and perhaps edit the third column to tune the noise distribution according to one's preferences (e.g. more variance, maybe a more light-tailed or heavy-tailed distribution).
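
If spreadsheets aren't your thing, the same idea is only a few lines of Python. Here's a minimal sketch (the player names and ratings below are made up for illustration; the noise is the same Uniform(0, 1/2) as in the spreadsheet, and you can swap in whatever distribution you prefer):

import random

def randomized_seeding(ratings, noise_width=0.5):
    """Sort players by rating plus independent Uniform(0, noise_width) noise.
    ratings: dict mapping player name -> rating (higher = better).
    Returns players in seeding order, seed 1 first."""
    noisy = {p: r + random.uniform(0, noise_width) for p, r in ratings.items()}
    return sorted(noisy, key=noisy.get, reverse=True)

# Hypothetical ratings, e.g. copied in from a GarPR/Braacket-style table.
ratings = {"Alice": 10.0, "Bob": 9.8, "Carol": 9.1, "Dave": 8.7, "Eve": 6.0}
print(randomized_seeding(ratings))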

Friday, January 31, 2020

The "Finding a Restaurant" Problem

There's a toy problem I thought of while walking around Oakland (which, at least where we were, was a grid) that I tweeted about, but felt like writing more about here. Basically, my friend and I were trying to get food in Oakland while attending an event, and he was being dumb about not using Google Maps to decide where to get food first, so we basically just decided to walk around until we found something we liked. We wanted to go back to the event afterwards, so e.g. walking in a straight direction was not the right thing to do, and we walked in a spiral instead, but was this the best thing to do?

Here's a model for this scenario: You start at (0, 0) on the integer 2-D lattice. You walk on some path consisting of only orthogonal moves on the 2-D lattice (so up/down/left/right moves only). Every time you visit a new point on the lattice, you have a probability p chance of "finding a restaurant you like", at which point your path ends and you return to the origin (using the shortest path of only orthogonal moves, i.e. if you find the restaurant at (x, y) you move |x|+|y| steps back to the origin). What path minimizes the expected number of steps you take? (Both on the way to the restaurant and back to the origin, though if the path only visits new locations, it's equivalent to just minimize the expected length of the path to the origin).

The conjecture is that the "spiral" (up 1, right 1, down 2, left 2, up 3, right 3...) is the optimal path, but it's unclear (at least to me) how to prove it. Namely, this seems to be a greedy strategy, but it's unclear what exchange argument if any proves it.
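
For what it's worth, the objective is easy to compute numerically for any candidate path that only visits new points, which at least makes it easy to sanity-check the conjecture against other paths. Here's a rough sketch (the truncation length and the value of p below are arbitrary; the origin is assumed not to get a restaurant check, and the tiny probability of not finding anything within the truncated path is ignored):

def spiral(n_points):
    """First n_points points of the spiral: up 1, right 1, down 2, left 2, up 3, right 3, ..."""
    x, y = 0, 0
    dirs = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # up, right, down, left
    pts, seg, run = [], 0, 1
    while len(pts) < n_points:
        dx, dy = dirs[seg % 4]
        for _ in range(run):
            x, y = x + dx, y + dy
            pts.append((x, y))
            if len(pts) == n_points:
                return pts
        if seg % 2 == 1:  # run length grows after every second segment
            run += 1
        seg += 1
    return pts

def expected_cost(path, p):
    """Expected total steps (walk out plus orthogonal walk back), assuming every
    point on the path is new, so each point gets an independent chance p."""
    total, survive = 0.0, 1.0
    for t, (x, y) in enumerate(path, start=1):
        total += survive * p * (t + abs(x) + abs(y))  # stop here: t steps out, |x|+|y| back
        survive *= 1 - p
    return total

print(expected_cost(spiral(500), p=0.1))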

If we are allowed diagonal moves, even just on the way back, or the grid is isometric instead, it's fairly straightforward to show the spiral is correct: You visit all the locations distance i from the origin before spending any time visiting a location distance i+1 from the origin, and you waste no steps on already-visited locations, so given any other path, via an exchange argument you can show the expected number of steps the spiral takes is no worse than the other path's.

We can also show the spiral is a 3/2 approximation, using the fact that the spiral is optimal if we are allowed diagonal moves on the way back, since any solution's actual expected cost is at most 3/2 times its expected cost when diagonal moves are allowed on the way back. The 3/2 comes from the fact that you pay a factor of at most 2 on the path back to the origin, but the path back comprises at most half your total length.

Furthermore, the ratio between expected costs can easily be improved to some function of p taking values in (1, 3/2) with a more refined analysis, since at most points in the spiral the distance to the origin is much less than the distance walked to reach that point and/or we are losing much less than a factor of 2 on the way back to the origin, and my guess is that this constant will end up being very close to 1 for lots of p, effectively resolving the problem for (limited) practical purposes. However, this analysis alone still falls short of proving optimality.

Saturday, August 31, 2019

Some of my least favorite words

Well, this stopped being a weekly blog pretty quickly, but I seem to have a trend of blogging about things I think are bad or could be better. So, maybe that's a sign I haven't felt that way about things recently? Following that trend, here are some of the many words I feel are bad as linguistic constructs (not for the reasons other "bad words" are bad) and that we could do without.

queue

Okay, let's start with some low-hanging fruit/upsetting anyone from the U.K. who happens to read this. I think queue as a part of the English language is great. It is good that the art of queueing is a cultural phenomenon in the U.K. and I hope it will find its way into the U.S.; somehow most grocery stores don't get that one shared line for multiple registers is a direct upgrade, and people still struggle with the idea that you shouldn't get on a BART car or elevator until the people on it get off. I think having the word queue as a synonym of line to distinguish the way Englishmen line up from us is great. Queues are also a great data structure, and I can remember that BFS is the one that uses a queue by just remembering "barbeque". But like, why did the French see the Middle French "queu", a word that in the French pronunciation sounds like how I would think to pronounce "qu", and then add a third useless letter? (My theory is they foresaw keyboards and they just wanted to be able to practice trills while typing.)

terrific

So if you take a combination of one of the prefixes horr- and terr- and one of the suffixes -or, -ible, and -ific, you get a word with negative connotations, except if you combine terr- and -ific you get terrific, which is a good thing. Terrific upsets me not just for this reason, but because we already have so many words that mean "good" and have a bit of flourish to them (awesome, amazing, extraordinary, incredible, stupendous...) that terrific doesn't even have the defense of "it adds diversity to one's vocabulary", because the synonyms of "good" are past the point where having more will really add much diversity.

twelfths

The e in twelfths is carrying a lot of baggage in the pronunciation of that word. Just getting through the (count 'em) five consonants smushed together requires me to put some amount of thought into how I'm moving my tongue so I don't just say "twelfs" instead (I admit I'm not the best at speaking in general, though). I haven't been this aware of the shapes my tongue makes while I pronounce things since I took an acoustics class. I think the language issue here is that the numbers 11-19 have a different naming scheme than 21 onwards, instead of calling them "tenny-one, tenny-two" or something like that (which after letting it sit for a bit doesn't sound that dumb, it's like "Lord Tennyson"). "Tenny-seconds" is much easier to say than twelfths and is consistent with the rest of the numbers' names. It even dodges the issue that "sixty-seconds" has, of sounding like you're talking about a minute and not the fraction 1/62. As another alternative, it could just be pronounced "twelveth" instead so there's a reasonable break between the consonants, and this also has the benefit that it mirrors how the rest of the fractions 1/11 to 1/19 just add a "-th" to the corresponding word without modifying it.

bookkeeper

This word is annoying because it has three instances of doubled letters in a row. I'm a fairly fast typer (not professional grade, but I can beat my friends in typeracer, which is the real test, so I may as well be). But I really struggle to type this word correctly without focusing very hard, because my fingers just can't comprehend that I want them to double-tap keys three times in a row, and in the middle of typing it I start to feel like there should also be two Ps, because why not, O, K, and E already did it. Here are some tries at typing the word without backtracking:

bookepeer
bookkeeeper
bookeeper
bookkeeper
bookkeeper
bookeeper
bookeeper
bookkeepr

I think my K key might not be functional because I swear I double tapped it for several of those attempts which came out as "bookeeper". But maybe I just mind-tricked myself because I double-tapped O and E before and after.

This word is actually pretty fine; most of the issues come from typing it quickly, and writing it isn't an issue because writing is slower and needs less dexterity. But triple-doubles should stick to being a basketball-and-Taco-Bell-only thing.

tertiary

So this word follows secondary, and I feel like it should be "thirdary", but hey, maybe there's another prefix for words related to the number three that sounds better or something. Something like "triary", because we use tri- as a prefix for three all the time! Nope, it's tertiary. I can't think of any other word that uses the prefix "terti-" in a manner meaning of or relating to three. Just to confirm this, I looked up what words in the dictionary start with "tert-" (since I'm honestly not sure whether "tert" or "terti" is doing the work of a prefix here) and I found tertial, tertian, and Tertullian. Tertial is something to do with feathers, tertian means having to do with paroxysms that occur every other day (so it's really more to do with the number two), and Tertullian is just someone's name. So we already have a perfectly good and common prefix in "tri" and we just randomly decided to borrow from a new root just for this one word, without even doing the same for related words.

floccinaucinihilipilification

I had to learn this word for a spelling test in 5th grade (I still remember how to spell it; I typed it here without looking it up). That's actually not the reason this word is on the list, it's fun to spell and say and when you're a nerdy 5th grader who was memorizing digits of pi for fun, learning how to spell a finitely long word with much more structure to it than irrational numbers is great. Also, I don't think it's right for anyone whose parents' last name is "Vaidyanathan" to have strong feelings about words being longer than necessary.

It's moreso the fact that a man named William Shenstone took four Latin words that all mean the same thing and smashed them together in a letter as what I assume was an attempt to be dramatic (with hyphens between them by the way, so "flocci-nauci-nihili-pili-fication"; it is very clear this is not him trying to invent a new word), and then a bunch of other people said he was inventing a new word as a joke and ran with it. It's the 1700s equivalent of if I sent my friend a text saying something was "super-duper-turbo-fantastic" and then next thing you know superduperturbofantastic is a word in the Oxford dictionary because people thought it was funny. It's also a word that more or less only exists because people think it's funny such a word exists, and it just devalues having a centralized authority of sorts who determines what words are "real" and make the cut for the dictionary. If people are upset that "literally" can be redefined to mean "not literally but used for emphasis anyway", I think those people should also be upset floccinaucinihilipilification is in the dictionary, because it definitely has less value and common use.

(Also, the act of saying "cinihilipilifi" is dumb).

Tuesday, August 6, 2019

Coding in an Algorithms Class

Like in most CS programs, Berkeley undergraduates usually take CS 170, officially named "Efficient Algorithms and Intractable Problems" (even though intractable problems are maybe 20% of the course; unofficially, the course has the much better title of "CS Theory" on some of the department webpages). Despite being part of the CS program, this class requires almost no coding; the only coding assignment is a final project which only accounts for 5% of the final grade, and the nature of the assignment is very unlike the rest of the work in the class (when I TA'd, I think fewer than 10% of students used an algorithmic technique they had used on a homework to tackle the assignment).

Instead, as the alternate title for the course suggests, it's a course on the theoretical aspects of CS. Namely, students learn about different techniques for designing algorithms and, towards the end, some other parts of CS theory. In an average week, a student will learn about a technique like dynamic programming, divide and conquer, etc. and then use it to solve homework and test questions of the form "here's a problem, write an algorithm that solves it quickly, prove your answer is correct." Briefly speaking, it's a class where you code on paper, blissfully ignoring the many issues of real code, and just focus on designing good algorithms.

Recently, I've been of the mind that the removal from coding may actually be a net negative (part of this may have to do with the exciting news that Jelani Nelson, a theoretical computer scientist who has a background in participating in coding competitions and coaches the Harvard ICPC team, recently joined Berkeley). That is, I think that there are many pedagogical gains to be made at a minor cost by turning maybe 20-30% of the homeworks into coding problems instead of math problems. The intent isn't to replace the proof-based problems, but rather to accompany them: Maybe e.g. a question is cut from the all-proofs homework, and then students are asked to instead spend that time coding up their on-paper solutions to the remaining problems.

Transfer skills from other courses

The main benefit of switching to a more coding-focused model is that many CS majors who make it to the theory courses already have a lot of coding classes on their transcript, and ideally a lot of CS fundamentals they got from those classes. These fundamentals should transfer over but often don't, and maybe they would if we used coding to draw them out.

My experience from TAing undergraduate theory is that students don't know how to debug on-paper algorithms very well (which is not necessarily their fault, especially at a school like Berkeley where the reality is that there are too many students for the amount of resources we have, so it's more likely we just can't do a good job teaching all of them). Sometimes, they just don't think their solution is wrong; they write a proof that they don't have the experience to tell is "hand-wavey", and feel like their incorrect algorithm is correct. The fact that some number of students just go to office hours and effectively ask TAs to debug for them, rather than learn how to tell when they're correct, doesn't help.

These same students usually take some number of coding classes where they are required to debug much more intricate programs (e.g. the dynamic programs we ask them to write can usually be implemented in ~10 lines of Python using barely anything besides lists and for loops, much shorter than a single method in an average project in the intro coding class). They should be familiar with the idea of generating test cases, running through code step-by-step to find issues, etc. These techniques aren't 1-to-1 applicable to debugging on-paper algorithms, but I think some of the fundamentals behind them still are. For example, finding examples that trick your code is a skill that definitely transfers to verifying paper algorithms, and comes up a lot in research (and if anything is even more important; when a correct algorithm can take years to prove correct, you want to be able to quickly show your incorrect algorithms are incorrect).
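
To make the "~10 lines of Python" claim concrete, here's roughly the kind of thing I mean (longest increasing subsequence is my own choice of example here, not necessarily one of the actual homework problems):

def lis_length(a):
    """Length of the longest increasing subsequence of a, via the classic
    O(n^2) dynamic program: dp[i] = length of the longest increasing
    subsequence ending at index i."""
    dp = [1] * len(a)
    for i in range(len(a)):
        for j in range(i):
            if a[j] < a[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp, default=0)

print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))  # 4, e.g. 1, 4, 5, 9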

I think something about the proof-based class setting leads students to fail to apply the iterative refinement approach to coding they learn in other classes to designing algorithms and writing proofs. Maybe it's that in order to know your algorithm is correct you need to know your proof is correct, and then it seems like turtles all the way down, instead of a bunch of green passes on unit tests. My hope is that by adding coding portions to small algorithmic problems, students are encouraged to transfer over some of the debugging skills they learn in other classes to the theory setting, and then have more time to develop the skills more unique to the theory class rather than relearn the transferable skills. If we structure the questions so they are encouraged to write code before proofs, maybe by the time they're sure their code is correct, they'll also have an idea about how to prove it's correct as well.

Besides this reason, I think there are several smaller benefits:

Bridge the gap between theory and practice


A (I think somewhat misguided) criticism of computer science theory, and of encouraging/requiring industry-minded students to take theory classes, is that theory is somewhat removed from practice and from what someone may be doing in a day-to-day SWE job. If I am right about this being misguided, we should be able to give students concrete examples that show how the algorithmic techniques are used in practice by the companies they're hoping to work for (for example, maybe we have a homework problem on the greedy algorithm for max-cut, and then we ask them to implement and run it on a real-life data set to solve a community detection problem that a SWE at a social media company might actually face). This comes with the pedagogical benefit that students will be happier and more motivated to learn something if they are convinced it is practical.
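
To sketch what the solution to that hypothetical homework problem might look like, here's the standard local-search version of the greedy max-cut algorithm (the guarantee is that it cuts at least half the edges; the graph below is a toy example standing in for the real-life data set):

def greedy_max_cut(adj):
    """Local-search max-cut: start with everyone on side 0 and flip any vertex
    whose flip increases the number of cut edges, until no single flip helps.
    adj: dict mapping each vertex to the set of its neighbors."""
    side = {v: 0 for v in adj}
    improved = True
    while improved:
        improved = False
        for v in adj:
            same = sum(side[u] == side[v] for u in adj[v])
            if same > len(adj[v]) - same:  # flipping v would cut strictly more edges
                side[v] = 1 - side[v]
                improved = True
    return side

# Toy "social network": two triangles joined by a single edge.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(greedy_max_cut(adj))

Each improving flip strictly increases the cut, and the cut can't exceed the number of edges, so the loop terminates; when it stops, every vertex has at least half its edges cut, which is where the half-the-edges guarantee comes from.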

Prepare for coding interviews

When they're optional, many students take algorithm design classes because the homework problems in them are the closest analog to questions in coding interviews. If they're taking them for this reason, then we may as well mimic the coding interviews somewhat by having students actually code.


Theoreticians do code (sometimes)

Even for people who are trying to go to grad school in CS theory, even if they are not going to work on research in algorithms, coding can be a great tool for just testing conjectures. If they're already in the CS major, taking the CS class, they may as well learn how to use the computer part of computer science to test theoretical ideas.

Easier to grade

This is far from the most important point to me, but after the initial hurdle of setting up autograders, coding questions are easier to grade than hand-written proofs. One worry is that an important part of algorithms classes is getting the right runtime. For most algorithms students are asked to design in these classes, the runtime is something like linear or quadratic, and so a test case of size 10^5 would run pretty quickly for the right answer but still be reliable in distinguishing between students who get the best runtime and those that use simpler but slower algorithms (whereas distinguishing between e.g. cubic and quartic runtimes might be a bit harder to do).

Downsides

There are admittedly a few downsides I see to trying to shift to a more coding-based theory class. First, there is the need to set up the infrastructure for the coding aspects of the class. But this is a non-recurring time investment that e.g. one could hire some undergraduates to figure out over a summer. Second, theory TAs might not be the best at TAing for coding assignments. But again, the coding aspects should be small compared to assignments in other classes, and will ideally only use very simple coding primitives, so chances are most of the issues with code will be theoretical rather than coding issues.

The main downside I can think of is that the more time students spend on coding, the less time they spend on the theory they're supposed to learn. Like most of math, CS theory is something you learn by applying it over and over, and rarely do ideas stick the first time, so we might be sacrificing a lot of understanding by exposing students to fewer examples in order to have them code as well. My counterarguments for why this might not really be too huge a downside are

1) while students may get exposure to fewer examples, they will likely get more out of each example,
2) students, especially those on the lower end of the grading curve (who maybe are trying to get the assignments done rather than doing them as a learning exercise), probably aren't actually getting much more out of doing many many examples if they weren't deeply connecting with the first few,
3) if students do have time and want more problems to work on, it shouldn't be that hard to supply them with some,
4) for topics like divide and conquer and dynamic programming, there's really only 3 or 4 core examples that need to be understood to solve 99% of the problems a student might see.

Friday, July 26, 2019

Settlers of Catan is a bad game

Settlers of Catan is probably one of the most important board games out there. In part because it's probably the most common "gateway game", i.e. game that gets people into "serious" board games. In part because it's a board game that people who aren't "serious" board game fanatics often happily congregate around. I can recall many people coming to our undergraduate board game club who were only interested in playing Catan, and even just hearing people on the buses making plans to play Catan Friday evening (alongside, you know, the other usual recreational activities of substance that undergraduates do on a Friday night). I too was enamored with Catan at one point during my high school days, and had several consecutive weekends of meeting with friends just to play Catan.

The title of this post tells you where it's going, and I do want to preface the meat of this post by saying that yes, I understand Catan has brought many people joy, that it is fundamentally very important, and that I and others who have enjoyed the population explosion of board gaming owe much to it. But there have been many times in my life where people would ask to play Catan and I had to turn them down and politely bottle up the many negative feelings about the game that had accumulated during my period of fanaticism for it, so that these people would not think I hated board gaming or that people who played games like Catan were very judgmental.

Thankfully Catan is no longer the popular game of choice amongst my friends, but I still have these feelings bottled up and would like to let them out. So, here's a blog post on the many reasons I think Catan is a bad game and not worth my time to play. This is unfortunately going to be a post that doesn't make much sense if you've never played Catan, but... surely you've played Catan, right? I'm also talking about the base game here; the only expansion I've played is the one that adds 2 more players (which only made the game worse, unfortunately).

Let's quickly go over a few reasons why Catan isn't great, and then I'll get to the nail in the coffin.

Luck based

You can have the highest expected number of resources earned per round, but no tile generates resources more than 5/36 of the time (which means that even the best tile pays out at least once in four rolls with probability of only about 45%; especially in the early game, when you may only have one tile that generates a certain resource, this is painful), so it's very easy to get screwed by rolls. This isn't helped by the fact that you lose half your hand if a 7 is rolled while you're over the hand limit, so sometimes you'll get a sequence of rolls that gives you a lot of resources and be punished for it.
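
If you want to double-check those numbers, the arithmetic is a couple of lines:

from fractions import Fraction

p_best = Fraction(5, 36)               # a 6 or an 8: 5 of the 36 dice outcomes
p_hit_in_four = 1 - (1 - p_best) ** 4  # best tile pays at least once in 4 rolls
print(float(p_best), float(p_hit_in_four))  # ~0.139, ~0.450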

This is unrelated to the rest of the post, but I feel like Catan's luck is very feel-bad. It feels really bad to go many turns without getting the one resource you need, and when you get it in a timely manner it's sort of what you're "expecting". If there were a way to make tiles generate resources more consistently and not ruin the pacing I might not mind the luck as much.

Snowbally 

The way to get VP is to build more settlements/cities, which give you more resources to get more settlements/cities, etc. So if the players are playing correctly (see below), the game will probably be mostly decided after one person gets a small lead that can snowball. Sure, the other players can gang up on the leader by blocking their resource gains with the robber but a) you can only block one tile, whereas each player should have settlements/cities next to at least six tiles at any given moment, and b) the leader should have enough resources to hoard knights in defense.

Trading

Trading is a pretty useless mechanic after the early game when played optimally. Usually you don't want to take trade offers once you have enough of an engine going to secure resources on your own. As we'll discuss soon, it's not like you need or want a large diversity of resources in order to win, which is the main thing trading enables, and if you have a monopoly on a resource you definitely don't want to be giving away the advantage of that monopoly. In addition, the person making an offer usually will only make that offer if it provides them a non-marginal benefit. I can remember a game where my opponent was desperately trying to trade for a wheat to finish a settlement while I was sitting on a hand with four wheat. The benefit of not giving him that wheat far outweighed whatever cards he was offering, which I probably didn't really need that much anyway.

One way to play

Now, let's establish what is widely considered the best Catan strategy. I'm not a Catan world champion or anything, but I once read a blog post written by one (which I can't seem to find, but if you Google "Catan strategy" you usually find suggestions like this one) so I'm obviously about as qualified. More or less, the key thing to know is that wheat and ore are good and you want as much of them as you can get. Your goal at the start of the game should of course be to maximize the number of resources you're getting per roll, so you can snowball into the late game. There are two ways to do this: build a new settlement, or upgrade a settlement to a city.

Building a new settlement by adding two roads to reach a new spot and then building a settlement there requires 3 brick, 3 wood, 1 sheep, and 1 wheat. On the other hand, upgrading a settlement to a city requires 2 wheat and 3 ore. Both give you 1 more VP and roughly the same resource gain long-term, but the latter requires many fewer cards, so you probably want to do it as much as you can. So the optimal production cycle in terms of needing the fewest cards is road->road->settlement->city, which in total costs 3 brick, 3 wood, 1 sheep, 3 wheat, and 3 ore.

This might make it look like you want a balanced spread of resources, but we haven't accounted for many things. First, development cards require wheat/ore and you'll probably want to be picking these up alongside the cycle of building cities to protect your engine (plus, if you can play the most knights, you'll get the 2 VP for largest army, which is nice because conveniently, cities alone can only get you to within 2 VP of winning). Notably, if you get locked out of wheat/ore by the robber, you're also locked out of moving the robber and are just dependent on the die rolls, so strategies that don't build multiple wheat/ore are more susceptible to getting stalled by the robber. Second, you start with two free settlements, and you should be putting these settlements in the best place available on the map. So your starting settlements are the ones you want to upgrade the most, and at the start of the game upgrading one of these to a city requires fewer resources and less time than trying to build a new settlement. Third, if you end up producing "too much" wheat/ore, you can always trade it in for the other resources. So it's better to take a strategy where you can get a lot of wheat/ore rather than try to have a balanced but much smaller amount of all five resources. I've had multiple games where I was gladly 4-for-1ing away wheat just to get the brick/wood I was missing, because I was so far ahead due to the burst of the wheat/ore start and the fact that my opponents didn't have nearly as much wheat as I did and struggled to do anything but build roads. Fourth, you start with two free roads, so if your opponents don't beat you to the settlement spots you build towards, you can reduce the cycle down to road->settlement->city.

So really, the ideal way to win is to build city->city->road->settlement->city->road->settlement->city, which requires 4 brick, 4 wood, 2 sheep, 10 wheat, and 12 ore, and then seal the last two points ideally by winning largest army, which will require at least 3 each of sheep/wheat/ore, for a total of 4 brick, 4 wood, 5 sheep, 13 wheat, and 15 ore (regardless of whether you win by largest army or not, you'll want to pick up knights to keep resources coming). You may have to in some edge cases resort to building settlements or getting longest road instead, but because of how snowball-y the game is, once you've slammed down the fourth city, you're probably so far ahead of your opponents that it doesn't really matter. Sometimes you may have to pay a few more brick/wood if your starting roads end up being worthless, but the general mix of resources required is still heavily skewed towards wheat/ore.
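
Here's a quick tally of those costs using the standard building prices (road: 1 brick + 1 wood; settlement: 1 brick + 1 wood + 1 sheep + 1 wheat; city: 2 wheat + 3 ore; development card: 1 sheep + 1 wheat + 1 ore), in case you want to verify the arithmetic or try other build orders:

from collections import Counter

COSTS = {
    "road":       Counter(brick=1, wood=1),
    "settlement": Counter(brick=1, wood=1, sheep=1, wheat=1),
    "city":       Counter(wheat=2, ore=3),
    "dev card":   Counter(sheep=1, wheat=1, ore=1),
}

def total_cost(builds):
    total = Counter()
    for b in builds:
        total += COSTS[b]
    return dict(total)

plan = ["city", "city", "road", "settlement", "city", "road", "settlement", "city"]
print(total_cost(plan))                     # 4 brick, 4 wood, 2 sheep, 10 wheat, 12 ore
print(total_cost(plan + ["dev card"] * 3))  # add 3 knights for largest army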

Putting it all together

So it's clear now that we need a disproportionate amount of wheat/ore in the ideal path to victory, and wheat/ore also enables the fastest start of upgrading your two starting settlements into cities. At this point, you maybe forgot that this was a blog post about why Catan is a bad game instead of a strategy guide. Well, the fact that there is one real strategy to go for compounds with the other factors. This is basically how a game of Catan would go for me once I learned the wheat/ore strategy:

1) Place starting settlements. The quality of these depends on the luck of who gets to place when (and maybe on the tile layout if you randomize that). You might argue that the snake-draft placement of settlements alleviates the luck here, but given the dominance of the wheat-ore strategy, the quality of settlements will probably fall off fairly early in the starting placement process. For example, in the default setup, there are two spots that are really good for the wheat/ore strategy, so the first/second settlements are way better than the rest. Again, wheat/ore are much better than the other resources, so it's not like if you get locked out of a good wheat/ore spot you can take a good brick/wood spot instead and expect to compete.
2) Try to get the best start. Upgrading your first two settlements to cities is pretty mindless. Building the next two settlements, it's also usually fairly obvious where you want to go with them (wherever pushes the wheat/ore agenda further), up to the luck of the rolls.
3) Once someone has a good start and is generating more resources than everyone else, because of the snowbally nature of the game, they will just win more until the game is over.

As discussed before, trading doesn't really help build alliances in response (usually how multiplayer games handle this) because the wheat/ore strategy is so much better, and doesn't require a diversity of resources, which is all a trading alliance really provides. In addition, there is good reason to be skeptical of trade offers, especially mid-game. So, the first few turns are very RNG-heavy, and then past that the player who has the best resource engine will probably win and the remainder of the game is just an exercise in watching them do so. I'm not saying that I've solved Catan, again, I'm not a world champion. But in this version of the Catan experience:

1) You're not really strategizing too much, you're just using this strategy that people have done the math on and determined is the best.
2) Even with the best strategy, you're largely relying on luck-based mechanics to progress it which can be very frustrating.
3) Since the main interactive mechanics are trading (which as we discussed before, is often not worth doing) and the robber (which, once you have the best resource engine, you get to more or less ignore due to knights), you're not really socializing much.

So I stopped playing.

Sunday, July 21, 2019

Berkeley Time (and why it, frankly, sucks)

I'm going to try to start making short blog posts on a (approximately?) weekly basis. They'll be about whatever, maybe my work, maybe games I've been playing, maybe just random topics I want to talk about. The main goal is just to be writing something on a regular basis, so that when it comes to writing important things (i.e. papers) I might be more "in practice". Maybe no one reads them, but that's fine. Most people don't care about people e.g. watching them do practice runs before a marathon, but that practice still has value, so hopefully the same principle could apply here.

Today I'm gonna write about "Berkeley time" (which is not actually a Berkeley-specific phenomenon, but I think it's the most egregious here) and why it upsets me so much. Basically, at Berkeley if a class is scheduled to start at e.g. 1:00, the professor won't actually start lecturing until 1:10. The reason this is the standard (or at least, what I hope is the reason) is because class times at Berkeley all start at XX:00 or XX:30, and class lengths are all multiples of half hours, so it's not uncommon that a student will have a class ending at 2:30 and another starting at 2:30, on opposite sides of campus. The 10 minute delay gives this student time to run across campus and be there for the entirety of both lectures. Again, this isn't unique to Berkeley, I know of other schools that have their own variants.

Before I talk about the easy, obvious fix for this, let me talk about the many many reasons why I find it so annoying.

Unclear what it applies to

So for lectures, Berkeley time always applies. But, lectures are not the only things happening on a college campus. I attend talks, meet with professors and other students, go to reading groups, go to trainings, go to club meetings, and do many more things. This is as a grad student, the position on the academic totem pole which I think has the fewest hard commitments in our schedule (because ideally, most of our time is being spent on research). For undergraduates and professors with much more packed schedules, there are probably many other commitments they have.

Does Berkeley time apply to these other events? In my experience, only a minority of these events clarified whether they'd start 10 minutes late, but it's not like you could just assume Berkeley time for the rest: plenty of them started at the stated time instead. Without clarification, the time spent on these events is much less useful: If half the people attending something think it starts on Berkeley time and the other half don't, then either the latter half waste 10 minutes waiting for the former, or the former miss out on 10 minutes of this event (both have happened to me).

Inconsistent with other, off-campus events

Now, for on-campus events like the ones I mentioned before, it might be arguable that they should all start 10 minutes late, in case anyone has a class right before these events that would like to attend these events. But, these aren't the only events in our lives. For example, Berkeley has a great resource called the Simons Institute, where researchers working in various areas of CS theory come from all over the world to spend a semester collaborating and sharing their research. There are many great talks that I as a CS theory student would like to attend, but these don't start on Berkeley time (well, usually they don't but sometimes they do and it's not always announced whether they do or not: see previous point) because the majority of people at Simons are not Berkeley-affiliated. So now, if I have class that ends right before a talk at Simons, I'm inevitably going to be late (as opposed to the obvious fix mentioned below, which would give me time to make it to the Simons event).

I also have collaborators from other universities who I'd like to meet with via video chat, and we usually set our meeting times to XX:00 or XX:30 like normal people do. But, this means sometimes I'll be late to these meetings because the Berkeley event schedule assumes it's okay to show up 10 minutes late to an event starting on the hour (or because I'm just wired to show up to everything 10 minutes late: see next point). Oh, and it gets fun when the other campus has its own starting time standard that isn't 10 minutes late (I've seen 3, 5, and 7 mentioned as how many minutes late everything starts on some campuses).

And even for non-work reasons, there are events in the town of Berkeley which don't operate on Berkeley time, which means sometimes one needs to rush to them from one's on-campus events, or just can't schedule them at a time it seems like one should be able to. Both "important" things like doctor's/dentist's appointments, and more fun things like pub trivia when you need to claim a table.

Sets a bad standard

So before being at Berkeley, when I had to show up to things on normal-person time, I usually made it a point to try to be 5 minutes early to events in my schedule, and was ashamed to be 1-2 minutes late. Now, I find myself cutting it close more often, and not minding being 5 minutes late so much. Maybe this is just a personal thing, maybe I'm just biased in my perceptions of when I showed up to events before/after coming to Berkeley, but regardless I do feel like the standard of Berkeley time does affect my (and maybe others') tendency to show up to things on time. I also don't think it helps that no one ever officially explained this to me, I just kind of caught on after my first few classes that this was happening. I won't dwell on this one too much, but I don't like that there is an aspect of our campus culture that is implicitly okay with showing up "late" to things.

There's an obvious and easy-to-implement fix

Everything could just start on time and officially end 10 minutes earlier. A 1:00-2:00 event could just run officially from 1:00 to 1:50, instead of officially from 1:00 to 2:00 and unofficially from 1:10 to 2:00.

That's it. An announcement that this is the new standard is literally all that needs to happen to get rid of all the issues with Berkeley time, while allowing people the same amount of time to get between classes and other commitments. The best part is, classes still "officially" start at 1:00 instead of 1:10 in the course registration system, in schedules published online, etc. So it's not like we're officially changing the start time of anything, and if anything updating all publicly listed event times is only being more honest. The only downside is, people have to shift their sleep schedule 10 minutes earlier, but I hardly think that's an issue. It's not like I'm advocating for a revolution of the campus schedule, because under this change it's really not changing at all relative to itself, and it's better aligning with the rest of the world.

I've thought about this a fair amount and can't think of any reason why someone in the administration shouldn't just snap their fingers and make this fairly small change happen (besides, well, the usual gripes people have with campus administrations), and that's really what frustrates me most of all about Berkeley time.