No ranking system will ever be perfect, and USAU gets lots of criticism for minor differences on the margins. But there are some serious problems that still need to be addressed -- and maybe some innovative solutions that could be employed.
September 4, 2014 by Sean Childers in Opinion
Ranking teams isn’t easy.
This is what many people get incredibly wrong in their criticism of USA Ultimate and the bid allocation system.
Even in a perfect world, ranking teams isn’t easy. That’s true whether you use computers or humans (polls, power rankings, expert forecasts) to do it. Do you reward regular season performance to date or try to predict future outcomes? Do you favor straight wins and losses or consider strength of schedule and point differentials?1
Ranking is even harder when you consider all the extra difficulties of ranking ultimate club teams. Star players miss key tournaments, teams play in few national competitions, and weather cancels important games. USAU also has to balance multiple directives, and the semi-pro leagues lurk in the background as a suddenly legitimate alternative.
It’s not easy.
But we can do better than this.
It’s year two of the Triple Crown Tour – year three of ranking-based bid allocation – and I’d venture to say our record is 0-3. In year one, the Capitals sat out the regular season, leaving just two bids for the Northeast women’s division, and then showed up to Regionals just in time to steal one from teams who had earned it.2
Last year, a single point swung the final bid allocation, moving a bid from the Southwest (Streetgang) to the Northwest (Rhino).
This year has exposed the system’s flaws even more: Prairie Fire dominated average competition all season long before holding on well enough at Chicago Heavyweights to keep their bid.3 Prairie Fire got the majority of the interconnectivity attention this year.4 But the truly questionable case is the Northwest Mixed, where a severe lack of games against out-of-region teams has produced some dubious results. The Northeast Mixed region5, traditionally a strong one, is down to two bids this year for three stellar teams – Ghosts, Slow White, and Wild Card (plus a few other good squads like Union and 7Express).
To their credit, USAU makes improvements each year. Elite teams can no longer wait until Regionals to play.6 Important concavities on the point curve are adjusted. Teams and tournaments have to turn scores in earlier.
But it’s whack-a-mole. The amount of gaming is only going to go up, and it’s only a matter of time before something truly scandalous happens – unless you think it already has. The comments on this article will remind me which other egregious outcomes I’ve forgotten. And it really only takes one or two bad apples to tear this system apart.
Even if you believe that USAU can keep improving outcomes, you also need to ask yourself whether gaming by teams will increase in both quantity and effectiveness. Can we always stay one step ahead of the curve without overfitting to past results, or without giving USAU some odd discretionary clause to attack suspected teams?
To some extent, theater and close calls are inherent in the system, and people need to stop expecting USA Ultimate to come up with a 100% controversy-free allocation. That is never going to happen, and you only need to look as far as NCAA bracketology or BCS formulas to realize that a hundredfold increase in resources wouldn’t solve everything.
But it’s time to be more aggressive in implementing solutions. And the first step of improvement is realizing what’s causing the problem.
Causing the Problems:
- Not Enough Games: File this under the “multiple-mandate” problem. USAU wants teams to play more tournaments. Teams don’t want to spend more. Computer rankings need large sample sizes.
- Not Enough Interconnectivity: Another multiple-mandate issue. To have a meaningful regular season, USAU needs tiering. The tiering leads to less interconnectivity – especially since TCT teams are apparently allowed to drop out of their “play-up” tournament. We’re lucky we haven’t yet reached the hypothetical where a team gets a bid by beating B teams, simply due to lack of interconnectivity, but that’s possible in this system. Since we know it’s possible, USAU can’t hide behind a “we’ll address it when it comes up” excuse; it should take proactive steps now.
- Canadian Teams:7 Higher travel costs, more international competition, Canadian Nationals, and a serious commitment to semi-pro leagues all contribute to Canadian club teams simply caring far less about the TCT. GOAT has shown up to the Pro Flight Finale – USAU’s elite TCT tournament! – with 16 players for two straight years. This isn’t an isolated incident; it’s a reality of the world and one that should be addressed in some manner.
- 100% Computer-Based: We’re big believers in computers, formulas, and statistics. But you’re always at risk of oddball results when you rely only on that information.
The Simpler Fixes:
- Require Pre-Season Rosters: Stop teams from using TCT events as tryout tournaments. If you want to be included in the final regular season rankings for bid purposes, then you need a 23-person roster submitted on July 1. That still gives each team a final 4 spots to tinker with, remove a few people with injuries, etc.
- Incorporate Last Year’s Nationals Finish: Finish in the top 12, return more than 70% of your roster, and play in the TCT? Automatic bid for your region (unless you are already taking your region’s automatic bid); you still have to earn it at Regionals, of course. You could also just include each team’s performance last season as a small part of its ranking this season, so that it tilts the scale in tight cases.
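To make the Nationals-finish rule above concrete, here is a minimal Python sketch of the eligibility check. The `TeamSeason` fields and the function name are hypothetical, invented for illustration; the top-12 and 70% thresholds come from the proposal itself, and I’m assuming “return rate” means returning players divided by last year’s roster size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TeamSeason:
    """Hypothetical record for one team; not a USAU data format."""
    name: str
    last_nationals_finish: Optional[int]  # None if the team missed Nationals
    returning_players: int                # players back from last year's roster
    last_year_roster_size: int
    plays_tct: bool


def earns_carryover_bid(team, top_n=12, min_return_rate=0.70):
    """Return True if the team would earn its region a bid under the proposal."""
    if team.last_nationals_finish is None or team.last_year_roster_size == 0:
        return False
    return_rate = team.returning_players / team.last_year_roster_size
    return (
        team.last_nationals_finish <= top_n
        and return_rate > min_return_rate  # "more than 70%", so strictly greater
        and team.plays_tct
    )
```

The exclusion for a team that is already taking its region’s automatic bid is left out here; it would need region-level data that this sketch doesn’t model.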
The (Soon-to-be) Necessary Fix:
- Stop Intra-Region Gaming by Disqualifying Some Results: This is quite the wildcard solution; many teams depend on games against nearby opponents to reach the 10-game minimum. Removing games from a formula that already suffers from small sample sizes is immediately suspect. But this area is ripe for extreme and increased exploitation going forward, like blowout round robins set up at the end of the season. There are middle grounds: disqualify in-region games after a certain date (it’s hard to know how to game the system midway through), or disqualify a random 50% of your in-region games (it’s hard to game the system if you don’t know which games are going to count).
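Both middle grounds above are easy to sketch. The Python below is an illustration only: the game records (dicts with `date` and `in_region` keys), the function name, and the parameters are all assumptions of mine, not anything USAU uses. The point of the random draw is that it would happen after the regular season ends, so no team knows in advance which in-region results will count.

```python
import random
from datetime import date


def countable_games(games, cutoff=None, keep_fraction=0.5, seed=None):
    """Return the games that would feed the rankings.

    games: list of dicts with 'date' (datetime.date) and 'in_region' (bool).
    cutoff: if given, in-region games played after this date are dropped.
    keep_fraction: share of the remaining in-region games kept at random.
    """
    rng = random.Random(seed)
    out_of_region = [g for g in games if not g["in_region"]]
    in_region = [g for g in games if g["in_region"]]
    if cutoff is not None:
        in_region = [g for g in in_region if g["date"] <= cutoff]
    # Keep roughly keep_fraction of the in-region games, chosen at random.
    n_keep = round(len(in_region) * keep_fraction)
    return out_of_region + rng.sample(in_region, n_keep)


# Illustrative season: 2 out-of-region games and 4 in-region games.
season = [
    {"date": date(2014, 7, 12), "in_region": False},
    {"date": date(2014, 7, 13), "in_region": False},
    {"date": date(2014, 7, 26), "in_region": True},
    {"date": date(2014, 8, 2), "in_region": True},
    {"date": date(2014, 8, 9), "in_region": True},
    {"date": date(2014, 8, 30), "in_region": True},
]
```

Out-of-region games always count in this sketch, which matches the goal of rewarding interconnectivity. Whether a team still reaches the 10-game minimum after the draw is a separate question the sketch doesn’t address.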
The Radical Solution:
Create a selection committee. Let the media vote.
I know what you’re thinking: easy for you to say, and quite pompous, Ultiworld! But hear me out:
- The reality is that subjective votes matter in other sports, such as NCAA football and NCAA basketball. This isn’t breaking new ground.
- It doesn’t need to count for all of the allocation. You can combine a committee system with a computer system and weight each; start by weighting the computer more heavily if you are skeptical of selection.
- 10 games will never be a large enough sample for this algorithm to produce great results. Some teams play more, but some teams play fewer. With Worlds years, weather, and gaming, it’s impossible to know exactly what sample size to even design for.
- Introducing a subjective element doesn’t commit you to any single system. You can design it a lot of ways.
Here’s how I might start:
- Selection committee of three: One media organization (like Ultiworld), one writer/coach with minimal club ties who writes on the divisions8, and one vote for a person or pair of people who work at USAU.
- Each member has to publish their ranking of the top 40 teams (ties allowed), along with justifications.
- The average of those rankings is combined with the computer rankings, using some amount of weighting (e.g. 60% computer, 40% committee), to create a final bid allocation.9
- At the end of the season, the top 6 teams in each region get to vote on whether to keep specific members. This is where the justifications come into play. Want to replace Ultiworld with Bamasecs? Want to get rid of the coach vote altogether? Go ahead and cast your vote. But smart teams will want the best-informed votes for the next season. There are too many votes out there to game your buddy onto the selection committee, so the only safe approach is to reward smart behavior and punish facially poor or misinformed voters.
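The weighting step in the committee proposal above can be sketched in a few lines of Python. Everything here is illustrative: the function name, the data shapes, and the team labels are made up, and the sketch blends ordinal ranks rather than USAU’s power rating numbers, which is only one of several ways the averaging could be done.

```python
from statistics import mean


def blend_rankings(computer_rank, committee_ballots, computer_weight=0.6):
    """Blend a computer ranking with committee ballots.

    computer_rank: dict of team -> rank (1 is best).
    committee_ballots: list of dicts of team -> rank, one per member (ties allowed).
    Returns (team, blended score) pairs, best (lowest score) first.
    """
    blended = {}
    for team, comp_rank in computer_rank.items():
        committee_avg = mean(ballot[team] for ballot in committee_ballots)
        blended[team] = (
            computer_weight * comp_rank + (1 - computer_weight) * committee_avg
        )
    return sorted(blended.items(), key=lambda item: item[1])


# Three hypothetical teams; the committee likes B more than the computer does.
computer = {"A": 1, "B": 2, "C": 3}
ballots = [
    {"A": 2, "B": 1, "C": 3},
    {"A": 3, "B": 1, "C": 2},
    {"A": 2, "B": 1, "C": 3},
]
```

With these made-up numbers, a 60/40 split in favor of the computer keeps A ahead of B, while flipping the weights to 40/60 puts B on top. That is the dial a skeptic of the committee could turn.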
USAU counters the idea of a subjective system by pointing out its weaknesses.
“There simply isn’t enough information out there, especially at the margins, to accurately rely on subjective input that is sufficiently informed and unbiased,” says USAU spokesperson Andy Lee. “While the amount of media attention and video coverage is growing, there are relatively few games, even among the top teams, available for subjective input, much less of the teams competing for bids at the margins. And there certainly aren’t any accountable, objective scouts out there watching these teams in person on a regular basis that can provide that kind of information.”
This argument misses the point. There’s not enough information out there for a great selection committee, but there’s also not enough information out there for a great algorithm. It’s becoming increasingly clear that the weird allocations are the fault not of the imperfect algorithm but of our imperfect club frisbee scene. Against this backdrop, there can be no perfect solution. But a system that mixes objective formulas and subjective views would be an improvement over the current one.
USAU is right to bring up concerns about bias, and I would be extremely skeptical of a system that just allowed every team to vote in a poll for the final rankings. But I find it a bit ironic that a sport whose default expectation is that players will show spirit and sportsmanship and make unbiased calls on the field (in the heat of the moment!) can’t find a few members of its community it trusts and then implement systems to hold them accountable.
I do think we can do better, but it might require taking some more dramatic steps than initially feel comfortable. I worry that we are going to hold back on making real changes and wait until things get worse.
***Correction: An earlier version of this article incorrectly implied that the Capitals did not play in TCT events this season.
1. This question is intrinsically linked to the first. Prioritizing wins over schedule and point differentials is less predictive of future results. Ranking is hard in part because we look to sports to offer order and clarity, and to answer “who’s best?” – when we should all know by now that sports is one of the most anything-can-happen, matchup-driven, magically spontaneous places left in the world (despite Fury and Revolver’s best late-’00s attempts to convince us otherwise).
2. Correction: This year, the Capitals did play in their TCT events. However, our understanding is that they only planned to finalize their roster around or after the Pro Flight Finale, and many players who have played with the Capitals this season (including some who played with them at the Pro Flight Finale) will not be on the series roster.
3. Any discussion of the bid situation should keep this in mind: Prairie Fire didn’t have to attend Chicago. It sounds like I’m hating on Prairie Fire, but I’m not. They played the games in front of them. They won them – dominated them, in fact. There is no suggestion that they gamed the rankings, even though they probably could have done so.
4. The Prairie Fire bid would be a bigger deal but for Sub Zero. Sub Zero, a team expected to compete for quarterfinals at Nationals, was missing so many pivotal players at key tournaments this year that they dropped out of the top 16. Smart money is on them to take Prairie Fire’s bid back at Regionals and limit the injustice, so to speak.
5. Disclosure: Ultiworld is based in the Northeast and, to some extent, Northeast-biased.
6. But see Furious.
7. We love our Canadian readers! This isn’t to say that the Canadian teams are making a bad choice; they’re just playing within the rules of the system. But their behavior does create some TCT problems.
8. Such as Skyd’s Lou Burruss or our Tiina Booth.
9. There’s the matter of actually averaging rankings and the “Power Ranking” that USAU produces. There are lots of ways you could do this: have the committee create their own power ranking numbers, back them out by converting one figure to the other, or ignore the power rankings once you get to the averaging step. I’m not sure yet what the best way would be, but lots of things would work.