@stuart_lynne of course.
It works like this in chess iirc:
Kid Timmy goes to chess club for the first time. Timmy is put in “chess school” and this goes on for a while with a focus on having fun to encourage him to stay with chess and the club. This chess school is something of a standard curriculum with first the basics of the basics, then some basic opening theory, endgame theory etc. When Timmy is ready to start playing ranked games against other kids, the leader gives Timmy a start rank, typically ELO 1000 or, if it’s an older kid that has shown a modicum of prowess, ELO 1200. It’s very inexact but probably a relatively suitable start rank at first. For a while at least.
Then Grigori joins the club too. Chess reminds him of good times at the thermal baths and the cafés with friends in the old country. Shrewd but with less growth potential than Timmy, Grigori is the MAMIL of chess, only in tweed rather than lycra. MAMIT? Anyway, he is given the ELO 1200 too after his initiation, which turns into more of a certification than a chess playschool for him. Grigori is then ready to start playing ranked games. It may well be that he quickly rises to something like the ELO 1400-1500 range.
If you yourself are in that rough range, then you may well become one of Grigori’s scalps in the next internal club tournament. You and others will lose rank a little for him to climb from his start rank to his actual level. Every single game where he climbs to his real level is a game where someone loses rank a bit. Yeah, it sucks, but it’s how they do it in chess. No one gets preferential treatment, it’s the same for all. They are always prudent with start rank and would rather have you climb on your own merits (results) instead, even though this happens at the expense of someone else.
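To make the "someone loses rank a bit" part concrete, here is a minimal sketch of the standard Elo update for a single game. The K-factor of 32 and the two ratings are just assumptions for illustration, clubs and federations use different values, and the function names are made up.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32):
    """Return both new ratings after one game.

    score_a is 1 for a win, 0.5 for a draw, 0 for a loss (from A's side).
    k=32 is an assumed K-factor, not something any club prescribes.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # With the same K for both players, what A gains is exactly what B loses.
    return rating_a + delta, rating_b - delta


# Grigori on his prudent start rank beats "you" in the 1400-1500 range.
grigori, you = 1200.0, 1450.0
new_grigori, new_you = update(grigori, you, score_a=1)
print(round(new_grigori - grigori, 1), round(new_you - you, 1))
```

With these numbers Grigori gains roughly 26 points and you lose the same amount, which is the whole point: an underrated newcomer climbs by taking points from the people he beats.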
And this is why half a year ago I used to say I liked AutoKitten. It wasn’t very good but it was good enough. The test model is same same but different. It could be used for the same purpose. Rather than a prudent start rank that is going to create a steady stream of new and upward moving subscribers passing through your category, taking your podiums in just about every race, you provide a rough guesstimate instead. It should create less turbulence and overall less frustration although such a start rank will never fully predict results. And we shouldn’t even try to predict results based on “external measures”, they should all be intrinsic, i.e. results based, except this one time.
Once the starting rank is given you drop all Newtonian performance measures and never ever use them again. Not until the next recalibration period, if you go with those. Not inside ranked racing. Then you can have casual races on the side using performance data and some fun-and-games races too, the kick the can and the hide and seek of Zwift, it’s all good. But not where rank is used. Not ever again. Ever.
And when people race ranked races, they should be put in pens with people of relatively similar ranks or it won’t be fun or productive. You can have fixed boundaries between pens (like “ELO 1800-1900”) or you can have dynamic pens (split all signups into four pens based on rank).
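A rough sketch of the two options, just to show the difference. The 100-point band width and the four pens come from the examples above. The function names are hypothetical, and so is the choice to treat a band as including its lower bound but not its upper one.

```python
def fixed_pen(rank, width=100):
    """Fixed boundaries: return the (low, high) band a rank falls in.

    The band includes low and excludes high, so 1900 belongs to 1900-2000.
    """
    low = (rank // width) * width
    return (low, low + width)


def dynamic_pens(ranks, n_pens=4):
    """Dynamic boundaries: sort all signups by rank, cut into n_pens groups."""
    ordered = sorted(ranks, reverse=True)
    base, extra = divmod(len(ordered), n_pens)
    pens, start = [], 0
    for i in range(n_pens):
        # The first `extra` pens take one more rider so nobody is left over.
        size = base + (1 if i < extra else 0)
        pens.append(ordered[start:start + size])
        start += size
    return pens


signups = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900]
print(fixed_pen(1850))        # the band this rider always races in
print(dynamic_pens(signups))  # bands depend on who showed up today
```

With fixed boundaries the pen depends only on your own rank. With dynamic pens the cutoffs move with whoever signed up for that particular race.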
The upsides to fixed boundaries are that they give predictability for people signing up and also some quality control for both Zwift and subscribers, because you won’t ever be matched against someone with a far higher rank. If attendance is too low (some arbitrary but tested attendance limit), then the race can still run but should no longer affect rank, and racers should be informed of this at the start at the latest so they can go do something else if they prefer. But the downside to fixed boundaries is you need to be damn sure of the integrity of your ranking system. If e.g. there is inflation in your system, then the boundaries will become obsolete and dysfunctional with time.
An upside to dynamic boundaries is it’s so flexible and given decent attendance you can provide all racers with a decent racing experience against decently equal opposition. You could still have the minimum attendance check and check for spread in rank in each of the pens that get calculated on the fly before each race. If attendance is too low, then no ranked race, just a casual race. If spread within a pen is too wide (you’d have to test and figure out how wide is still enjoyable), then no ranked race, just a casual race.
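The two checks could look something like this. The thresholds (10 riders, 200 rank points) are pure placeholders, since the right values would have to be found by testing, and `race_mode` is a made-up name.

```python
def race_mode(pen_ranks, min_attendance=10, max_spread=200):
    """Decide on the fly whether a pen runs as a ranked or a casual race.

    min_attendance and max_spread are placeholder values, not tested limits.
    """
    if not pen_ranks or len(pen_ranks) < min_attendance:
        return "casual"  # too few riders for results to mean much
    if max(pen_ranks) - min(pen_ranks) > max_spread:
        return "casual"  # opposition too unequal to be enjoyable
    return "ranked"


print(race_mode([1500] * 12))           # enough riders, tight spread
print(race_mode([1500] * 3))            # attendance too low
print(race_mode([1200] + [1500] * 11))  # one rider far below the rest
```

The same function works for fixed boundaries too, where only the attendance check would ever trigger.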
Most competitive online games lean towards dynamic boundaries but can have some elements of fixed limits in them. Compare with the Grandmaster level in chess. You might e.g. have to get to an “elite limit” to be able to join certain events, as quality control on all levels for said events. But if such an event draws a big crowd of racers, then you divide them into pens. And you never ever run parallel pens, like two pens with the same ranking range just because there were thousands of signups. You always, every time, use rank to make the cutoffs.
So if in a huge event, The Annual Global Ultra Series of Zwift 2023™, there are among the thousands of signups 500 guys with a rank of 1371, then you may want to split those in two. All racers in those two pens will have the same rank, but they are not actually two parallel pens. You still divided the field in the middle, only with everyone having the same rank you made the cut based on surname or something else obviously unrelated to performance instead (i.e. you DO NOT use something like weight for backup cutoff).
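In code, the backup cutoff is nothing more than a secondary sort key. A small sketch, with made-up riders and field names, where weight is present in the data but deliberately never looked at:

```python
def seeding_order(riders):
    """Order riders by rank (highest first), ties broken by surname only."""
    return sorted(riders, key=lambda r: (-r["rank"], r["surname"].lower()))


# Four hypothetical riders who all share rank 1371.
riders = [
    {"surname": "Dahl", "rank": 1371, "weight_kg": 82},
    {"surname": "Berg", "rank": 1371, "weight_kg": 61},
    {"surname": "Aas", "rank": 1371, "weight_kg": 74},
    {"surname": "Cho", "rank": 1371, "weight_kg": 68},
]

ordered = seeding_order(riders)
half = len(ordered) // 2
pen_1, pen_2 = ordered[:half], ordered[half:]
print([r["surname"] for r in pen_1], [r["surname"] for r in pen_2])
```

The cut still falls at a position in the rank-sorted list. Surname only decides who lands on which side when the ranks are identical.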
EDIT: Another upside to dynamic pen limits is you can start to take control over field size. Is it enjoyable to race with 10 other people? Maybe it is, maybe it isn’t. Is it enjoyable to race with 5000 other people? Maybe, maybe not. But if you want to promote steering you will want to control field size or steering will become meaningless. And since rank distribution will be somewhat bell shaped, with dynamic pens you can create more pens in the middle, the fat belly of the bell curve, and thus control field size. No longer the fat C and B pens and half-empty D and A pens.
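One way to sketch that: instead of fixing the number of pens, fix a target field size and let the number of pens follow from the signups. The target size here is an arbitrary assumption and the function name is hypothetical.

```python
import math


def pens_by_field_size(ranks, target_size=50):
    """Cut rank-sorted signups into pens of at most target_size riders."""
    ordered = sorted(ranks, reverse=True)
    n_pens = max(1, math.ceil(len(ordered) / target_size))
    # Integer cut points keep pen sizes within one rider of each other.
    cuts = [i * len(ordered) // n_pens for i in range(n_pens + 1)]
    return [ordered[cuts[i]:cuts[i + 1]] for i in range(n_pens)]


# Toy field: most riders bunched around 1200, one outlier at each end.
field = [1000, 1190, 1195, 1200, 1200, 1205, 1210, 1400]
for pen in pens_by_field_size(field, target_size=3):
    print(len(pen), pen)
```

Because the cuts follow head count rather than rank width, the crowded middle of the bell curve gets split into more, narrower pens, while the thin tails end up in wider ones of about the same size.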