Saturday, August 31, 2019

Some of my least favorite words

Well, this stopped being a weekly blog pretty quickly, but I seem to have a trend of blogging about things I think are bad or could be better. So, maybe that's a sign I haven't felt that way about things recently? Following that trend, here's some of the many words I feel are bad as linguistic constructs (not for the reasons other "bad words" are bad) and we could do without.

queue

Okay let's start with some low-hanging fruit/upsetting anyone from the U.K. who happens to read this. I think queue as a part of the English language is great. It is good that the art of queueing is a cultural phenomenon in the U.K. and I hope it will find its way into the U.S., somehow most grocery stores don't get that one shared line for multiple registers is a direct upgrade and people still struggle with the idea that you shouldn't get on a BART car or elevator until the people on it get off. I think having the word queue as a synonym of line to distinguish the way Englishmen line up from us is great. Queues are also a great data structure, and I can remember that BFS is the one that uses a queue by just remembering "barbeque". But like, why did the French see the Middle French "queu", a word that in the French pronunciation is pronounced like how I would think to pronounce "qu" and then added a third useless letter? (My theory is they foresaw keyboards and they just wanted to be able to practice trills while typing.)

terrific

So if you take a combination of one of the prefixes horr- and terr- and one of the suffices -or, -ibble, and -ific, you get a word with negative connotations except for if you combine terr- and -ific and you get terrific, which is a good thing. Terrific upsets me not just for this reason, but because we already have so many words that mean "good" and have a bit of flourish to them (awesome, amazing, extraordinary, incredible, stupendous...) that terrific doesn't even have the defense of "it adds diversity to one's vocabulary" because the synonyms of "good" are past the point where having more will really add much diversity.

twelfths

The e in twelfths is carrying a lot of baggage in the pronunciation of that word. Just getting through the (count 'em) five consonants smushed together requires me to put some amount of thought into how I'm moving my tongue so I don't just say "twelfs" instead (I admit I'm not the best at speaking in general, though). I've never been more aware of the shapes my tongue makes while I pronounce things since I took an acoustics class. I think the language issue here is that the numbers 11-19 have a different naming scheme than 21 onwards, instead of calling them "tenny-one, tenny-two" or something like that (which after letting it sit for a bit doesn't sound that dumb, it's like "Lord Tennyson"). "Tenny-seconds" is much easier to say than twelfths and is consistent with the rest of the numbers' names. It even dodges the issue that "Sixty-seconds" has, of sounding like you're talking about a minute and not the fraction 1/62. As another alternative, it could just be pronounced "twelveth" instead so there's a reasonable break between the consonants, and this also has the benefit that it mirrors how the rest of the fractions 1/11 to 1/19 just add a "-th" to the corresponding word without modifying it.

bookkeeper

This word is annoying because it has three instances of doubled letters in a row. I'm a fairly fast typer, (not professional grade but I can beat my friends in typeracer which is the real test so I may as well be). But I really struggle to type this word correctly without focusing very hard because my fingers just can't comprehend that I want them to double-tap keys three times in a row and in the middle of typing it I start to feel like there should also be two Ps because why not, O K and E already did it. Here's some tries at typing the word without backtracking:

bookepeer
bookkeeeper
bookeeper
bookkeeper
bookkeeper
bookeeper
bookeeper
bookkeepr

I think my K key might not be functional because I swear I double tapped it for several of those attempts which came out as "bookeeper". But maybe I just mind-tricked myself because I double-tapped O and E before and after.

This word is actually pretty fine, most of the issues come from typing it quickly, writing it isn't an issue because it's slower and there's less dexterity needed. But triple-doubles should stick to being a basketball-and-Taco-Bell-only thing.

tertiary

So this word follows secondary, and I feel like it should be "thirdary" but hey maybe there's another prefix for words related to the number three that sounds better or something. Something like "triary", because we use tri- as a prefix for three all the time! Nope, it's tertiary. I can't think of any other word that uses the prefix "terti-" in a manner meaning of or relating to three. Just to confirm this, I looked up what words in the dictionary start with "tert-" (since I'm honestly not sure whether "tert" or "terti" is doing the work of a prefix here) and I found tertial, tertian, and tertullian. Tertial is something to do with feathers, tertian means having to do with paroxysms that occur every other day (so it's really more to do with the number two) and tertullian, which is just someone's name. So we already have a perfectly good and common prefix in "tri" and we just randomly decided to borrow from a new root just for this one word, without even doing the same for related words.

floccinaucinihilipilification

I had to learn this word for a spelling test in 5th grade (I still remember how to spell it; I typed it here without looking it up). That's actually not the reason this word is on the list, it's fun to spell and say and when you're a nerdy 5th grader who was memorizing digits of pi for fun, learning how to spell a finitely long word with much more structure to it than irrational numbers is great. Also, I don't think it's right for anyone whose parents' last name is "Vaidyanathan" to have strong feelings about words being longer than necessary.

It's moreso the fact that a man named William Shenstone took four Latin words that all mean the same thing and smashed them together in a letter as what I assume was an attempt to be dramatic (with hyphens between them by the way, so "flocci-naucci-nihili-pili-fication"; it is very clear this is not him trying to invent a new word), and then a bunch of other people said he was inventing a new word as a joke and ran with it. It's the 1800s equivalent of if I sent my friend a text saying something was "super-duper-turbo-fantastic" and then next thing you know superduperturbofantastic is a word in the Oxford dictionary because people thought it was funny. It's also a word that more or less only exists because people think it's funny such a word exists, and just devalues having a centralized authority of sorts who determines what words are "real" and make the cut for the dictionary. If people are upset that "literally" can be redefined to mean "not literally but used for emphasis anyway" I think those people should also be upset floccinaucinihilipification is in the dictionary, because it definitely has less value and common use.

(Also, the act of saying "cinihilipilifi" is dumb).

Tuesday, August 6, 2019

Coding in an Algorithms Class

Like in most CS programs, Berkeley undergraduates usually take CS 170, officially named "Efficient Algorithms and Intractable Problems" (Even though intractable problems are maybe 20% of the course. Unofficially, the course has a much better title of "CS Theory" on some of the department webpages). Despite being part of the CS program, this class requires almost no coding; the only coding assignment is a final project which only accounts for 5% of the final grade, and the nature of the assignment is very unlike the rest of the work in the class (when I TA'd, I think less than 10% of students used an algorithm technique they used on a homework to tackle the assignment).

Instead, as the alternate title for the course suggests, it's a course on the theoretical aspects of CS. Namely, students learn about different techniques for designing algorithms and towards the end some other parts of CS theory. In an average week, a student will learn about a technique like dynamic programming, divide and conquer, etc. and then use it solve homework and test questions of the form "here's a problem, write an algorithm that solves it quickly, prove your answer is correct." Briefly speaking, it's a class where you code on paper, blissfully ignoring the many issues of real code, and just focus on designing good algorithms.

Recently, I've been of the mind that the removal from coding may actually be a net negative (Part of this may have to do with the exciting news that Jelani Nelson, a theoretical computer scientist who has a background in participating in coding competitions and coaches the Harvard ICPC team, recently joined Berkeley). That is, I think that there are many pedogogical gains to be made at a minor cost by turning maybe 20-30% of the homeworks into coding problems instead of math problems. The intent isn't to replace the proof-based problems, but rather accompany them: Maybe e.g. a question is cut from the all-proofs homework, and then students are asked to instead spend that time coding up their on-paper solutions to the remaining problems.

Transfer skills from other courses

The main benefit of switching to a more coding-focused model is that I think many CS majors who make it to the theory courses have a lot of coding classes on their transcript already, and ideally a lot of CS fundamentals they got from those classes, and these fundamentals should transfer over but don't, but maybe would if we used coding to draw them out.

My experience from TAing undergraduate theory is that students don't know how to debug on-paper algorithms very well (which is not necessarily their fault, especially at a school like Berkeley where the reality is there are too many students for the amount of resources we have it's more likely we just can't do a good job teaching all of them). Sometimes, they just don't think their solution is wrong; they write a proof that they don't have the experience to tell is "hand-wavey", and feel like their incorrect algorithm is correct. The fact that some number just go to office hours and effectively ask TAs to debug for them rather than learn how to tell when they're correct doesn't help.

These same students usually take some number of coding classes where they are required to debug much more intricate programs (e.g. the dynamic programs we ask them to write can usually be implemented in ~10 lines of Python using barely anything besides lists and for loops, much shorter than a single method in an average project in the intro coding class). They should be familiar with the idea of generating test cases, running through code step-by-step to find issues, etc. These techniques aren't 1-to-1 applicable to debugging on-paper algorithms, but I think some of the fundamentals behind them still are. For example, finding examples that trick your code is a skill that definitely transfers to verifying paper algorithms, and comes up a lot in research (and if anything is even more important; when a correct algorithm can take years to show is correct, you want to be able to quickly show your incorrect algorithms are incorrect).

I think something about the proof-based class setting leads students to fail to apply the iterative refinement approach to coding they learn in other classes to designing algorithms and writing proofs. Maybe it's that in order to know your algorithm is correct you need to know your proof is correct, and then it seems like turtles all the way down, instead of a bunch of green passes on unit tests. My hope is that by adding coding portions to small algorithmic problems, students are encouraged to transfer over some of the debugging skills they learn in other classes to the theory setting, and then have more time to develop the skills more unique to the theory class rather than relearn the transferable skills. If we structure the questions so they are encouraged to write code before proofs, maybe by the time they're sure their code is correct, they'll also have an idea about how to prove it's correct as well.

Besides this reason, I think there are several smaller benefits:

Bridge the gap between theory and practice


A (I think somewhat misguided) criticism of computer science theory and encouraging/requiring industry-minded students to take theory classes is that theory is somewhat removed from practice and what someone may be doing in a day-to-day SWE job. If I am right about this being misguided, we should be able to give students concrete examples that show how the algorithmic techniques are used in practice by the companies they're hoping to work for (for example, maybe we have a homework problem on the greedy algorithm for max-cut, and then we ask them to implement and run it on a real-life data set to solve a community detection problem that a SWE for a social media company). This comes with the pedagogical benefit that students will be happier and more motivated to learn something if they are convinced it is practical.

Prepare for coding interviews

When they're optional, many students take algorithm design classes because the homework problems in them are the closest analog to questions in coding interviews. If they're taking them for this reason, then we may as well mimic the coding interviews somewhat by having students actually code.


Theoreticians do code (sometimes)

Even for people who are trying to go to grad school in CS theory, even if they are not going to work on research in algorithms, coding can be a great tool for just testing conjectures. If they're already in the CS major, taking the CS class, they may as well learn how to use the computer part of computer science to test theoretical ideas.

Easier to grade

This is far from the most important point to me, but after the initial hurdle of setting up autograders, coding questions are easier to grade than hand-written proofs. One worry is that an important part of algorithms classes is getting the right runtime. For most algorithms students are asked to design in these classes, the runtime is something like linear or quadratic, and so a test case of size 10^5 would run pretty quickly for the right answer but still be reliable in distinguishing between students who get the best runtime and those that use simpler but slower algorithms (whereas distinguishing between e.g. cubic and quartic runtimes might be a bit harder to do).

Downsides

There are admittedly a few downsides I see to trying to shift to a more coding-based theory class. First, there is the need to setup the infrastructure for the coding aspects of the class. But, this is a non-recurring time investment that e.g. one could hire some undergraduates to figure out over a summer. Second, theory TAs might not be the best at TAing for coding assignments. But again, the coding aspects should be small compared to assignments in other classes, and will ideally only use very simple coding primitives, so chances most of the issues with code will be theoretical rather than coding issues.

The main downside I can think of is that the more time students spend on coding, the less time they spend on the theory they're supposed to learn. Like most of math, CS theory is something you learn by applying it over and over, and rarely do ideas stick the first time, so we might be sacrificing a lot of understanding by exposing students to fewer examples in order to have them to code as well. My counterarguments for why this might not really be too huge a downside are

1) while students may get exposure to less examples, they will likely get more out of each example,
2) students, especially those on the lower end of the grading curve (who maybe are trying to get the assignments done rather than doing them as a learning exercise), probably aren't actually getting much more out of doing many many examples if they weren't deeply connecting with the first few.
3) if students do have time and want more problems to work on, it shouldn't be that hard to supply them with some,
4) for topics like divide and conquer and dynamic programming, there's really only 3 or 4 core examples that need to be understood to solve 99% of the problems a student might see.