Race ranking discussion

Firstly, well mindful that this is a repeat of what has been said many times: pen enforcement and category assignment are completely different from ranking. Discussing them as if they are two different aspects of the same thing, with the same acceptance criteria and objectives and, in this case, the same rules around cheating, is probably going to be less than fruitful.

In both cases you are showing instances of “cheating” for advantage.
All cheating for advantage is the same, and that is what has to be targeted. That has a context; the fact that you move up or down is just a specific.

In pen enforcement: cheating to allow you to enter a lower group.
In ranking: cheating to enter a higher group.

  • Both can be seen as cheating to your advantage

however

in a pen enforcement event, cheating to ensure you are in a higher group than you are going to be good at doesn’t seem to be something we should be concerned about.

Likewise, “cheating” to be in or remain in a lower division than you are capable of seems similarly not in itself an issue. It’s only of value if you continue to underperform. Doesn’t worry me too much anyway: cruise around at the back of the field or even the middle of it, see if I care. Some might even suggest that is a domestique encapsulated :rofl:

The solution is quite simple: if the number of cheaters is larger than the number of honest riders, then the honest ones should accept that racing on Zwift is pointless. So leave the subject alone and train on Zwift with your friends. That the races are pointless can be seen, for example, in the fact that real professionals cannot keep up in the races on Zwift. You can see every day how professionals are left behind by untrained noobs. That is proof enough for me.

1 Like

I think there is a distinction to be made between cheating that is very much within Zwift’s control and can easily be enforced in code (say category rules or the mid-race weight/height change kerfuffle) and cheating that isn’t (say poorly calibrated power sources).

4 Likes

well, maybe. i’m not good enough at it to express myself with math, i’m only coming up with reasons it doesn’t fit into my own personal use of zwift, and i’m sure other people do the same things i do in races. it could work, possibly. it might work for more people than it doesn’t work for, i don’t know how other people like to race on zwift, i only know myself.

it needs to be an option an organiser can turn off and on for their race series and not the default setting, i’m at least firm on that. i wonder how many race organisers would actually use any implementation of it though… a lot of the best racers in each category already gravitate towards each other for at least one or two heavily attended races per week, since you cannot gain rank or race your peers if you don’t…

would i personally want it for every little race i do, probably not. sometimes i’ll enter a race that has maybe three other people in it, because it’s at a convenient time, or because it’s on a course that matches the intervals i want to do

i’m also skeptical of zwift’s ability to implement something this complex without screwing it up. no offense to zwift

Yes this is surely a much better way of putting it than what I wrote. Things like poorly calibrated power meters and erroneous height/weight will pretty much always be with us, which is why Flint’s focus:

is so misguided. Zwift can spend as long as they like on that Augean Stable, but it would be more productive to design a racing system that actually works, even in the presence of over-reading power meters. This is not impossible, they just have to want to do it.

2 Likes

@xflintx

Some comments on things mentioned:

I never would have thought you dared to even think about tightening up hardware requirements in (a subset of) racing. And… yeah… somehow I think intentional or neglectful “cheating” is what stings the most for us racers. But at the same time… we tend to sort it under “uncontrollable variables” and try to forget the fact that we might be racing against a miscalibrated Tacx Flux that could give an even bigger advantage than someone’s cruising sometimes. But to us it has been like weight. You can’t control weight. Well, you could in theory, but it would be asking too much right now to ask of everyone to buy some special wifi scales that don’t even exist on the market yet. From our perspective Zwift gives the impression of being very very careful with complicating things further for new subscribers or making the basic setup even more expensive than it is already.

However, if you have the guts to take it into consideration then… yeah, go for it! I might even agree with a decision to postpone results-based categorization for a worthy matter like that, I just don’t want it to be buried and never dug up again (and you would know I wouldn’t… as usual).

I mean, if you come to me tomorrow and say I need a secondary power source to join a race from now on, then I’d buy one. But that’s just me and not everyone is like that. Not in this virtual world where some people spend thousands of dollars on the n + 1 and even expensive trainers but are still so cheap they can’t upgrade to a computer made in the last decade so that you are forced to support OS’s from an earlier millennium. And then some can’t afford the n + 1 to begin with.

But not every race has to have tough requirements like that. Neither with regards to hardware nor softer legitimacy like ranks and measures. I’m all for the idea of variety in racing. How fun wouldn’t it be with e.g. real hare and hounds races where strangers actually worked together like intended and in good spirit? Fun and games. And for that you need to lighten up racing. Lower perceived entry barriers. Not be so serious. All that. Draw in the crowd.

Then we also need to toughen up racing at the same time. Some of the races. To keep that crowd that will mature and whose preferences may change. We may have misunderstood your point, but I personally don’t like the idea of a rank score that just sits there. Its primary purpose across pens is for creating these pens, reasonable and enjoyable ones, one way or the other. Not necessarily in all races, but in some, the “competitive” ones. The dynamics of results-based categorization is so terribly important to any sport, the sound mobility it provides for the riders. Were you tall, short, heavy, light? Are you a good sprinter or better in the hills? A tactician or a crazy Jensie? Is your 20 min better than your 2 min or the other way around? It doesn’t matter. You work what you have to your advantage. It’s the result that matters, and only that. And a good result should be rewarded - with upward mobility, elevated to a higher ground, not with the promise that you can stay the big fish in the small pond forever.

What do you do with the rank score? Keep it forever? Reset it sometimes? Most rank systems in online games will suffer from a slow inflation, typically in the top end (the upper tail of the bell curve will get drawn out and no one will be able to catch up with the guys there). Even ELO in chess, a comparatively super stable system, has suffered in periods. We discussed it in here 2 years ago or so.

So regardless of what would be the ideal or vision, you would probably have to reset it every once in a while to combat inflation, so better make it a planned, recurring, foreseeable thing. Once per year/indoor season maybe? And then everyone needs to recalibrate. Better make that a thing too, a scheduled event period, something fun and exciting. That way you also get to pronounce the winners of the season before the recalibration (and reevaluate behind the curtains - do we need a little tweaking before next season?)

4 Likes

Except this isn’t true, the elite end of Zwift isn’t absolutely full of people cheating, there will be people I’m sure, but benchmarking against me there is no way it is the majority or even a hugely sizeable minority.

I am a good cyclist, I know this, but I am not special. However there are increasingly few people beating me and ranked better than me. If the number that still are is significantly inflated by cheats then I am too good to be true on a worldwide platform.

And yet I know I’m not cheating; my height and weight are accurate and I’m dual recording on two of the most trusted products on the market (Kickr and Favero Duos). If anything my higher reading power meter, at high power at least, when races are being won, is the back up (pedals) not the game controlling power.

So, I can accept lots of opinions about cheating and controlling categories, but actually the idea that professionals can’t keep up with the front of an elite race because those there are cheating, cannot be accurate.

PS there was a pro in our Wahoo Le Col race on a Saturday, he kept up fine and even went clear on his own before he got brought back by the blob. He didn’t win but then he has a broken collarbone and it was a sprint finish.

1 Like

Yeah, pros doing badly in Zwift races isn’t generally due to the others cheating but rather unfamiliarity with the physics and tactics of a Zwift race.

2 Likes

Except @Hiltja_Schnell has a point though.

Real-life story:
I’m not a persuader. I don’t have “it”, that thing, whatever it is, that makes you persuasive. That’s why I need to compensate with annoying persistence here. However, while being critical to Zwift racing, I do love Zwift as a concept (or I wouldn’t bother to be here in this forum). And so I have tried in vain to persuade many people to try out Zwift, people I know would get hooked quickly and thank me later. So far I have only succeeded once. A bit unexpectedly, my sister-in-law and her dude, a former amateur triathlete, suddenly caved in and bought a Kickr Snap. “Good starter choice”, I said and reminded them to always keep 8 bar pressure and calibrate regularly. I never rode with her dude outdoors (never dared to), but a fair bet was he was stronger than me.

Anyway, they got hooked on Zwift alright, and then the texts started coming. You know, the worst kind of Zwift related texts, like “Yay! I won again!” (in the Companion App…) or “Why did I suddenly get a polka dot jersey? Is that a bad thing? Does it mean anything?” (Yes, I actually got a text like that.) And I was like “…”

I was starting to wonder though, but I had to tread carefully. Neither should have to lose face over this and I didn’t want to come across as a rear end cavity myself. But did they even do the initial calibration when they set up their Snap? I would have put him as a B, a fair guess, but A? Well, maybe… And I knew she used to be fit, but B and all her previous bike experience was a bit of spinning? Not impossible, although…

However, after a move they had to set up the trainer again and remembered something I said about calibration, so I provided a step-by-step instruction. This time they followed it. And after that there was complete radio silence for quite a while. Poor guys, they were so ashamed. Not their fault. Had they listened to me the first time, though, it would have saved them the embarrassment.

There are elite races with dual recording requirements. I assume that is typically the case when RL pros are involved too. And then it probably comes down more to, like @Anna_Ronkainen suggests, familiarity, but also motivation and a vastly different power curve. But when there is no dual recording requirement and with a variety of hardware among participants to boot, the little story above is probably not an uncommon scenario. We also know there are intentional cheats up there, at the top level.

I’m not so worried about cheating in the elite races though. There will always be external pressure on Zwift from sponsors and partners to keep things clean up there. They will always do what they can because they have to. What we are primarily discussing in here is “cheating” created by weird race rules and unsuitable categorizations. And that pertains not to the elite (at all in fact) but to the lower level racing. By definition, since it has to do with artificial ceilings to how well you are allowed to do in races (the elite don’t have that ceiling) and how racers are still allowed (hopefully not for long now) to circumvent this, some kind of double standards.

2 Likes

8 posts were split to a new topic: Power meter for races

I don’t know enough about amateur chess to speak to it, but just being 1v1 is a big difference.

More to the point, the answer to Flint’s question is still exactly that: it doesn’t matter. For Zwift, you don’t need to design any kind of incentives around Ranking. You could, but definitely don’t need to, and it should not be a blocker for implementing it. Ranking facilitates racing, and it does so better than any other system. You can innovate on top of it but only when it’s in place. So get it in place, and iterate.

If people want to compare themselves… go for it? Nothing stopping you, no harm in doing that because once again the only way to be better in that comparison is to do exactly what we want folks to do — race bikes against strong(er) opponents.

1 Like

Sure.

I think the word itself causes some (mis)apprehension if it invokes the sense of a scoreboard. Ranking — to me — is strictly the mechanism to create a numeric representation of a rider’s relative strength.

So, implementation-wise, it’s kind of enough to just have the Ranking. It could be left strictly to organizers to determine how it’s used, but it’s probably much easier for the average racer if there’s a global default categorization based on it. It gives you a quick baseline of expectations and a quick way to refer to it. Say, using arbitrary numbers, you could have 0–600 Ranking and each 100 forms a category A–F. Or it could be divided into 10 categories by default, and maybe most organizers would combine most of them in pairs. Most likely it’d gravitate toward a similar spread of fields for most races anyway, but if there’s a global default, then it’s easier to tune to find those good cutoff points.
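The arbitrary example above (a 0–600 Ranking cut into 100-point bands A–F) can be sketched in a few lines of Python. This is only an illustration: the function name, the choice that a higher Ranking means a "higher" letter (A at the top), and the clamping of exactly 600 into the top band are all assumptions, not anything Zwift has specified.

```python
def default_category(ranking: float, band_width: int = 100, labels: str = "FEDCBA") -> str:
    """Map a Ranking score to a default category letter.

    Assumption: labels run from the lowest band to the highest,
    so with the defaults 0-99 is F and 500-600 is A.
    """
    if ranking < 0:
        raise ValueError("ranking must be non-negative")
    # Clamp so the very top of the scale (e.g. exactly 600) stays in the top band.
    band = min(int(ranking // band_width), len(labels) - 1)
    return labels[band]
```

An organizer who wants fewer fields could pass a wider `band_width` and a shorter `labels` string, which is roughly the "combine them in pairs" idea.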

Are you saying Zwift doesn’t have an issue with people racing in the wrong fields? Bold :slight_smile:

2 Likes

Gerrie
Forum users have picked up on Flint’s post in respect of a 2nd power source.

Do you feel these posts might be better in their own thread, making the subject easier to find for other forum users and removing them from this race ranking thread?

Is it appropriate for me to ask you to move them?

Thanks

2 Likes

Agreed, I realised after my last post I was down a tangential rabbit hole. Will stick to the OP unless split :slight_smile:

1 Like

Many things that are “physiologically impossible” should trigger some kind of warning or even block the use of Zwift until the user corrects the situation.

Guys climbing the Alpe in sub30; certain height/weight combos; people doing 500W for minutes and minutes; etc

3 Likes

Totally fine to ask. I moved the power meter discussion. Power meter for races - #8 by OleKristian

2 Likes

thank you

I just read through the thread more thoroughly. Something someone wrote, I may have misunderstood, but I just felt I needed to make doubly sure everyone understands a certain point:

You cannot mix ranking with performance based categories.

And this is why some of us have been saying that the purpose of ranking can’t be to just sit there and provide bragging rights. Ranking, if you go that way rather than points, is the driver behind a results based categorization. A nice ranking is certainly something to brag about but it’s not its fundamental point. It exists in a certain context. It’s part of a mechanism. Put that cog in the wrong place and the wheels won’t turn anymore.

Think about it. Assume we tidy up the ZP ranking or create something similar and keep the old or the old-with-a-new-twist (test model) categories, what will happen? The same thing that forces us to tidy up the ZP ranking before it is of any use, that’s what will happen.

You want to have ranking be as linear as possible to results and hence to past-results-as-proxy-to-future-results. It will never be perfect if you have pens but as long as there is upward mobility between cats and as long as that mobility is driven by results, then ranking can stay linear enough. (Actually, rank distribution is not linear, it’s bell shaped, but you get the point - you can’t have bumps, gaps and overlaps, you want smooth.)

In a world where ranking is not what decides your category, you will get rank overlaps between pens and other weird phenomena. The top of a performance based cat can keep perfecting their rank indefinitely. Depending on design it may flatten out at some point and stop growing. But that’s just it. It should never flatten out because you keep winning (unless you are the best of the best of all riders in all categories). It should only flatten out because you fail to improve further. And whether those at the top of a performance based category can still improve is a question we don’t always have an answer to since they are not getting promoted. For a healthy rank system, any rank has to be able to face competition from both somewhat lower and somewhat higher ranks. And for that to happen there needs to be enforced promotion based on rank, and not high school physics.

So no, you can absolutely not mix rank with performance based categories. It’s a bad idea. It will not work. It defeats the purpose of rank too. If you want to create a ranking system, then you also need to prepare for results based categories.

Oh, and I don’t like the chess analogies when speaking of high rank as a goal in itself. They come out wrong, it’s not like that in chess. I have played chess at club level in my youth. The rank is never the goal. Well, you might e.g. want to climb to Grand Master. But that’s not really a rank. It’s a division between ranks in a results based system/categorization. You don’t get to play grand master events unless you are one, i.e. your rank is at the grand master minimum or higher. You don’t get to race in elite Zwift events unless you are elite yourself (well, sometimes you do, focus on the analogy please). Or put it in future tense: You don’t get to race in elite events if you haven’t reached an elite rank.

The purpose in chess is to beat other players, period. In a chess tournament you will typically be matched against somewhat equal opposition, but there is always a span, just like there would be in a Zwift category defined by rank somehow (fixed cat limits or dynamic?) You play a guy with equal rank and win and you only gain in rank a little, he loses a little. You play a guy with lower rank and win, you gain very little in rank and he loses very little. You play a guy with higher rank and win, you gain a lot and he loses a lot.
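For anyone who wants to see the three cases above in numbers, here is a minimal sketch of the standard Elo update for a single 1v1 game. The K-factor of 32 and the function names are assumptions chosen for illustration; chess federations use various K values, and a mass-start bike race would need some multi-rider adaptation that this sketch does not attempt.

```python
from typing import Tuple


def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expectation: probability-like score of beating the opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def elo_update(winner: float, loser: float, k: float = 32.0) -> Tuple[float, float]:
    """Return the new (winner, loser) ratings after a decisive game.

    The winner gains exactly what the loser drops, and the size of the
    swing depends on how unexpected the result was.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


# Gain for a 1500-rated winner against equal, lower and higher rated opposition.
gain_vs_equal = elo_update(1500, 1500)[0] - 1500
gain_vs_lower = elo_update(1500, 1100)[0] - 1500
gain_vs_higher = elo_update(1500, 1900)[0] - 1500
```

With these assumed values the gain against an equal is 16 points, against a player rated 400 lower about 3, and against a player rated 400 higher about 29.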

Sure, your ELO (if it’s any good in the context) is bragging rights, vague and general rights, but you don’t play for rank. You aim to take names. You want scalps. “I beat so-and-so last Saturday.” Your rank is just an anonymized testimony of your progress (the scalps on your belt) and your entry ticket to the next big boys’ club. It’s only that.

And it shouldn’t be any different in Zwift, although you’re up against masses in cycling so it becomes less personal.

5 Likes

“Sure, your ELO (if it’s any good in the context) is bragging rights, vague and general rights, but you don’t play for rank.”

And when you travel somewhere to play chess in a tournament I assume they don’t make you start at the bottom against people with inferior ratings. Presumably they have competition tiers and you start in one that is appropriate to your rank so that you have a reasonable expectation of an interesting game and the possibility of moving your rank up or down.

Zwift racing is like traveling for IRL events: you are racing against a large number of other people, many of whom you have never raced against. To get the best race you want everyone in the race to be at a similar rank. Having people in the race with a very dissimilar rank is like playing chess against someone with a vastly different rank. Probably not much fun for either party.

1 Like

@stuart_lynne of course.

It works like this in chess iirc:

Kid Timmy goes to chess club for the first time. Timmy is put in “chess school” and this goes on for a while with a focus on having fun to encourage him to stay with chess and the club. This chess school is something of a standard curriculum with first the basics of the basics, then some basic opening theory, end game theory etc. When Timmy is ready to start playing ranked games against other kids, the leader gives Timmy a start rank, typically ELO 1000 or, if it’s an older kid that has shown a modicum of prowess, ELO 1200. It’s very inexact but probably a relatively suitable start rank at first. For a while at least.

Then Grigori joins the club too. Chess reminds him of good times at the thermal baths and the cafés with friends in the old country. Shrewd but with less growth potential than Timmy, Grigori is the MAMIL of chess, only in tweed rather than lycra. MAMIT? Anyway, he is given the ELO 1200 too after his initiation, which turns into more of a certification than a chess playschool for him. Grigori is then ready to start playing ranked games. It may well be that he quickly rises to something like the ELO 1400-1500 range.

If you yourself are in that rough range, then you may well become one of Grigori’s scalps in the next internal club tournament. You and others will lose rank a little for him to climb from his start rank to his actual level. Every single game where he climbs to his real level is a game where someone loses rank a bit. Yeah, it sucks, but it’s how they do it in chess and no one gets preferential treatment, it’s the same for all. They are always prudent with start rank and would rather have you climb on your own merits (results) instead, even though this happens at the expense of someone else.

And this is why half a year ago I used to say I liked AutoKitten. It wasn’t very good but it was good enough. The test model is same same but different. It could be used for the same purpose. Rather than a prudent start rank that is going to create a steady stream of new and upward moving subscribers passing through your category, taking your podiums in just about every race, you provide a rough guesstimate instead. It should create less turbulence and overall less frustration although such a start rank will never fully predict results. And we shouldn’t even try to predict results based on “external measures”, they should all be intrinsic, i.e. results based, except this one time.

Once the starting rank is given you drop all Newtonian performance measures and never ever use them again. Not until the next recalibration period, if you go with those. Not inside ranked racing. Then you can have casual races on the side using performance data and some fun-and-games races too, the kick the can and the hide and seek of Zwift, it’s all good. But not where rank is used. Not ever again. Ever.

And when people race ranked races, they should be put in pens with people of relatively similar ranks or it won’t be fun or productive. You can have fixed boundaries between pens (like “ELO 1800-1900”) or you can have dynamic pens (split all signups into four pens based on rank).

Upsides to fixed boundaries are they give predictability for people signing up and also some quality control for both Zwift and subscribers because you won’t ever be matched against someone with a far higher rank. If attendance is too low (some arbitrary but tested attendance limit), then the race can still run but should no longer affect rank, and racers should be informed of this at start at the latest so they can go do something else if they prefer. But the downside to fixed boundaries is you need to be damn sure of the integrity of your ranking system. If e.g. there is inflation in your system, then the boundaries will become obsolete and dysfunctional with time.

An upside to dynamic boundaries is it’s so flexible and given decent attendance you can provide all racers with a decent racing experience against decently equal opposition. You could still have the minimum attendance check and check for spread in rank in each of the pens that get calculated on the fly before each race. If attendance is too low, then no ranked race, just a casual race. If spread within a pen is too wide (you’d have to test and figure out how wide is still enjoyable), then no ranked race, just a casual race.
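A rough sketch of how dynamic pens with those two checks could look. Everything concrete here is an assumption for illustration: the `(name, rank)` record shape, the function names, the thresholds of 10 riders and 150 rank points (the real ones would have to be tested, as said above), and the choice to apply the attendance check per pen rather than per race.

```python
from typing import List, Tuple

Rider = Tuple[str, float]  # (name, rank); a hypothetical record shape


def dynamic_pens(signups: List[Rider], n_pens: int = 4) -> List[List[Rider]]:
    """Sort signups by rank (highest first) and cut them into n_pens near-equal pens."""
    ordered = sorted(signups, key=lambda r: r[1], reverse=True)
    base, extra = divmod(len(ordered), n_pens)
    pens, start = [], 0
    for i in range(n_pens):
        # The first `extra` pens take one rider more so that nobody is left over.
        size = base + (1 if i < extra else 0)
        pens.append(ordered[start:start + size])
        start += size
    return pens


def is_ranked(pen: List[Rider], min_attendance: int = 10, max_spread: float = 150.0) -> bool:
    """A pen counts for rank only if it is big enough and its rank spread is narrow enough."""
    if len(pen) < min_attendance:
        return False  # too few riders: casual race only
    ranks = [rank for _, rank in pen]
    return max(ranks) - min(ranks) <= max_spread
```

A pen that fails `is_ranked` would still run, just as a casual race that leaves everyone's rank untouched.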

Most competitive online games lean towards dynamic boundaries but can have some elements of fixed limits in them. Compare to Grand Master level in chess. You might e.g. have to get to an “elite limit” to be able to join certain events, as quality control on all levels for said events. But if such an event draws a big crowd of racers, then you divide them into pens. And you never ever run parallel pens, like two pens with the same ranking range just because there were thousands of signups. You always, every time, use rank to make the cutoffs.

So if in a huge event, The Annual Global Ultra Series of Zwift 2023™, there are among the thousands of signups 500 guys with a rank of 1371, then you may want to split those in two and all racers in those two will have the same rank but it’s not actually two parallel pens. You divided them in the middle but with all having the same rank you made the cut based on surname or something obviously redundant instead (i.e. you DO NOT use something like weight for backup cutoff).
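As a sketch of that cut: sort on rank first and on something deliberately irrelevant second, then divide in the middle. The function name and the `(surname, rank)` record shape are hypothetical; the only point is that the tie-breaker is not a performance measure.

```python
from typing import List, Tuple

Rider = Tuple[str, float]  # (surname, rank); a hypothetical record shape


def split_in_two(pen: List[Rider]) -> Tuple[List[Rider], List[Rider]]:
    """Divide an oversized pen in the middle.

    Rank decides the order; surname only breaks exact ties, so no
    physical measure such as weight ever enters the cutoff.
    """
    ordered = sorted(pen, key=lambda r: (-r[1], r[0]))
    middle = (len(ordered) + 1) // 2
    return ordered[:middle], ordered[middle:]
```

With 500 riders all on rank 1371 this gives two pens of 250, split purely alphabetically.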

EDIT: Another upside to dynamic pen limits is you can start to take control over field size. Is it enjoyable to race with 10 other people? Maybe it is, maybe it isn’t. Is it enjoyable to race with 5000 other people? Maybe, maybe not. But if you want to promote steering you will want to control field size or steering will become meaningless. And since rank distribution will be somewhat bell shaped, with dynamic pens you can create more pens in the middle, the fat belly of the bell curve, and thus control field size. No longer the fat C and B pens and half-empty D and A pens.
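To illustrate the field size point: if the number of pens follows from a target field size instead of being fixed at four, the crowded middle of the rank distribution simply ends up with more pens. A minimal sketch, where the function name and the target size are assumptions:

```python
import math
from typing import List


def pens_by_field_size(ranks: List[float], target_size: int) -> List[List[float]]:
    """Cut rank-sorted signups into near-equal pens of at most target_size riders."""
    ordered = sorted(ranks, reverse=True)
    n_pens = max(1, math.ceil(len(ordered) / target_size))
    base, extra = divmod(len(ordered), n_pens)
    pens, start = [], 0
    for i in range(n_pens):
        # Spread any remainder over the first pens so sizes differ by at most one.
        size = base + (1 if i < extra else 0)
        pens.append(ordered[start:start + size])
        start += size
    return pens
```

Because every pen holds about the same number of riders, pens in the fat belly of the curve cover a narrower rank range than pens out in the tails.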

2 Likes