Any update on results-based ranking, Zwift score, etc.?

Finish in the top or bottom 25% of any race series or individual race, however it is set up, and you get promoted or relegated a category. Have some fun with it.
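
A minimal sketch of that rule, assuming a finishing order listed best first and a 25% cut at each end. The function name and rider labels are made up for illustration.

```python
def category_moves(finish_order, fraction=0.25):
    """Return (promoted, relegated) riders from a finishing order, best first.

    Assumption: the cut is rounded down, so small fields move nobody.
    """
    cut = int(len(finish_order) * fraction)
    promoted = finish_order[:cut]
    # Slice from an explicit index so a cut of 0 yields an empty list.
    relegated = finish_order[len(finish_order) - cut:]
    return promoted, relegated


promoted, relegated = category_moves(["A", "B", "C", "D", "E", "F", "G", "H"])
print(promoted, relegated)  # ['A', 'B'] ['G', 'H']
```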

i like that idea, but it’s terrible. lol

Is there an issue whereby your scoring is never going to be right while you populate pens by CE and score by the ‘new system’? The two systems need to be in step with each other.

You need people seeded by the scoring system to show their true elo / true skill score.

Zwiftracing.app has shown when a rider changes CE category their elo then becomes volatile as they re-balance in their new environment. Any launch metric will change when races start to seed using the new system. There is going to be volatility at the start regardless of how you slice and dice it.

Either you start ‘fresh’ with 5-minute efforts as rankings, or you start with rankings based on the current CE, which become obsolete and outdated as soon as you start seeding pens by the ‘new system’.

The bandage will need to be ripped off quickly whichever way you do it, as there will be some pain and noise.

Could be carnage, I look forward to that

3-minute power should also be included: many riders complete Zwift’s short climbs in under 5 minutes, so using only 5-minute power will not show their full potential.

It’s not counted in the CE calculation, so I doubt it would be used for their scoring system.

I think if you tailor the solution to the environment, that’s probably about right. I just get the feeling 3 min would be too short-term and cause too much noise from certain parts of the ‘community’.

The more data points that are used, the better, to stop sandbaggers. It’s also better not to publish all the parameters for how riders are graded, or riders can manipulate their power to stay within them. For example, a rider could push 400 watts for 3 minutes, then drop to 200 watts for 2 minutes once they’ve completed a climb. This skews their 5-minute power to 320 watts when it could be 350 watts.
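
To check the arithmetic in that example, here is a small sketch that assumes one power sample per second and finds the best average over a sliding window. The function name is hypothetical and this is not how Zwift computes anything.

```python
def best_average(samples, window):
    """Highest mean power over any `window` consecutive per-second samples."""
    running = sum(samples[:window])
    best = running
    for i in range(window, len(samples)):
        # Slide the window one second: add the new sample, drop the oldest.
        running += samples[i] - samples[i - window]
        best = max(best, running)
    return best / window


# 400 W for 3 minutes on the climb, then 200 W for 2 minutes afterwards.
samples = [400] * 180 + [200] * 120

best_3min = best_average(samples, 180)
best_5min = best_average(samples, 300)
print(best_3min, best_5min)  # 400.0 320.0
```

The 5-minute figure is just (400 × 3 + 200 × 2) / 5 = 320 W, while a 3-minute window still sees the full 400 W effort.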

My general feeling on this is that Zwift has an awful understanding of metrics, data, statistics, and probability. Hiding your zMAP and zFTP formulas is bad because CE is such BS right now. Racing is full gas in C for 12 minutes; it’s almost like we are back to the old days. You can’t have top category zMAP and zFTP without gaming the system, but that’s the case, and it’s done by racing full gas for 10-15 minutes at the start. I was a ZR.app skeptic, but the best and most competitive races I have ever done are ZR.app races. So instead of being a corporate troll for Zwift, start owning up to the community being way ahead of Zwift, and to Zwift taking anything the community does and ruining it. Your smugness is neither helpful nor appropriate.

some people want a results based ranking system. others want one based on power. my ideal category system is one that is designed to only put me in pens with people who have your exact personality

zrl does their best with division 1 but occasionally someone who isnt actual pond life finds their way into there by accident and its a real mood killer

Any news on ZRS?

If I’m not mistaken, CE started in Q1 2022 …

It has been a loooooong wait for something better, for people who like racing.

Hi @xflintx is there any update or timeline on this?

From a development perspective, we’re nearing the finish line. We’ve been tinkering with the formula to get scores feeling good enough to test and have been working on getting the final technical pieces in place. Developmentally complete doesn’t mean it’s immediately ready for release, so the next piece is to finalize the plan for public testing and determine a date to start, which we’re working through right now. So, still chugging along on getting all the right pieces in place to start letting Racing Score out into the wild.

I spent some time catching up all the chatter about this project over the past ~6 months. I understand there is probably a clear path toward implementation at this point, but stepping back a bit and thinking big picture, I had some ideas that I haven’t seen come up before and I wonder if they are worthy of consideration?

Imagine fitting a multivariate distribution for power features — say 15 second, 1-, 5-, 20-minute, and average power — over a bunch of previous events on the same course. We want to use this distribution to rank current signups for an upcoming event relative to the distribution of previous performances on the same course.

Layperson explanation
The goal is to organize races where everyone has the opportunity to compete against others who are similarly skilled. To do this, we look at how cyclists have performed on the same course in the past. We’re particularly interested in how long they can maintain certain levels of effort for different lengths of time (like 15 seconds, 1 minute, 5 minutes, 20 minutes, and the whole event).

We take all this past performance data and create a map that shows us what typical performance looks like. We use this knowledge for new events, comparing entrants’ best efforts (e.g., over the last 60 days) over these same intervals with our map. Depending on where they fit on this map, we can figure out who their closest competitors are.

With this mapping, we can put entrants into a group with riders who are most like them in overall ability.

We could also move from a fixed 5 or 6 categories to dynamic category sizes that depend on signups. While this would add some complexity for last-minute signups, it would create a more dynamic environment where entrants may, by chance, land in the upper or lower part of a given category. This system could be something like no fewer than four categories, and no more than 50 riders in each category.
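
As a rough sketch of the dynamic-category idea, assuming each signup already has a single ranking score and that pens are contiguous slices of the ranked list. The function name, parameters, and rider data are hypothetical.

```python
import math


def assign_pens(scores, min_pens=4, max_pen_size=50):
    """Split riders into contiguous pens by score, strongest first.

    `scores` maps rider name -> score (any single ranking number).
    Assumption: pens are made as equal in size as possible.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    # At least `min_pens` pens, and enough pens to respect the size cap.
    n_pens = max(min_pens, math.ceil(len(ranked) / max_pen_size))
    base, extra = divmod(len(ranked), n_pens)
    pens, start = [], 0
    for i in range(n_pens):
        size = base + (1 if i < extra else 0)
        pens.append(ranked[start:start + size])
        start += size
    return pens


# 230 made-up signups; a lower rider number means a higher score.
signups = {f"rider{i}": 1000 - i for i in range(230)}
pens = assign_pens(signups)
print([len(p) for p in pens])  # [46, 46, 46, 46, 46]
```

With 230 signups the 50-rider cap forces a fifth pen. With 120 signups the four-pen minimum applies instead, giving pens of 30.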

This approach ensures that every cyclist gets to race in a group that’s about right for their level – not too easy, not too hard. Ideally it makes the race challenging but fair for everyone, giving each cyclist a good chance to compete and succeed.

Technical Explanation
In this approach, we would employ a multivariate statistical model to analyze the distribution of cyclists’ power outputs over multiple durations (15-second, 1-minute, 5-minute, 20-minute, and event average). The core idea is to fit a multivariate distribution to historical power output data from previous events on the same course. This distribution captures not only the marginal distributions of power outputs at each duration but also the interdependencies among them.

Once the distribution is established, we utilize it to define multivariate quantiles. These quantiles allow us to rank current competitors based on how their individual power profiles align with the historical data. A rider’s rank is determined by the quantile into which their power output vector falls, reflecting their relative performance across all measured intervals.

This ranking system is not just about where a rider stands in terms of a single power output metric, but rather it’s a comprehensive assessment of their overall power profile in relation to the historical performance landscape. By utilizing multivariate quantiles, we can categorize riders into performance-based groups more accurately, ensuring that each group consists of riders with similar overall power characteristics, despite potential variations in individual power output intervals.
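
To make this more concrete, here is a deliberately simplified sketch. It assumes the power features are roughly jointly normal and collapses them onto one "overall strength" axis: the sum of the per-duration z-scores, rescaled using the correlation matrix. That is only one of many ways to define a multivariate quantile, and the function names and toy numbers are made up for illustration.

```python
from statistics import NormalDist, mean, pstdev


def fit(history):
    """Fit per-feature mean/std and the correlation matrix from past rides.

    `history` is a list of equal-length power vectors in watts, e.g.
    [15 s, 1 min, 5 min, 20 min, event average].
    Assumption: every feature varies (non-zero standard deviation).
    """
    cols = list(zip(*history))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    n = len(history)
    z = [[(x - m) / s for x in c] for c, m, s in zip(cols, mus, sds)]
    corr = [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]
    return mus, sds, corr


def quantile(rider, model):
    """Quantile of a rider's summed z-scores under the fitted normal model."""
    mus, sds, corr = model
    z_sum = sum((x - m) / s for x, m, s in zip(rider, mus, sds))
    # Var(sum of z-scores) equals the sum of all correlation entries.
    scale = sum(sum(row) for row in corr) ** 0.5
    return NormalDist().cdf(z_sum / scale)


# Toy historical data for one course (hypothetical numbers).
history = [
    [900, 500, 350, 290, 250],
    [700, 420, 300, 250, 220],
    [1100, 560, 380, 310, 270],
    [600, 380, 260, 220, 190],
    [800, 460, 320, 270, 235],
]
model = fit(history)

signups = {"strong": [1100, 560, 380, 310, 270], "weak": [600, 380, 260, 220, 190]}
ranking = sorted(signups, key=lambda r: quantile(signups[r], model), reverse=True)
print(ranking)  # ['strong', 'weak']
```

A rider sitting exactly on the historical mean lands at the 0.5 quantile. The resulting quantiles could then feed whatever category cut-offs are chosen.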

Thoughts?

I think you’ve had a few statistics lessons but lack a link to practice. In practice, the more variables you fit, the better the fit, but the worse the predictions of real-life outcomes become, and that’s the only thing that matters.

A simpler model is always better: more intuitive, and easier to understand the limitations of. That makes it much easier for people to work with, both as an organiser and as a rider.

The biggest issue is that power-based sorting is the wrong solution. Past results are a hard endpoint and are going to be a better predictor of future results than trying to cram a bunch of power datapoints into a black box and end up with a results prediction.

I think just as big of an issue is that if you still only have 4 cats, there’s always going to be a large spread of performance in each cat, and just as much whining about not being able to stay with the lead group, etc.

just one: delete the wikipedia citation notes out of your homework before you hand it in

There’s no reason you have to have fixed categories; in fact, that would encourage people to game the system. Floating ranges and matchmaking that places people in the appropriate pen at or close to when the flag drops would result in a better experience, assuming some mechanism exists for allowing friends to choose to race together.