Distribution of Zwifters by 20min w/kg

Inspired by a question in the “Too fast for D Grade at age 74” thread, I thought I’d plot the w/kg distribution for Zwifters.

First, the power data. Ideally you’d trawl ZwiftPower and extract the FTP for every single rider, but as that’s not practical I did the next best thing and looked up the ZwiftPower results for L’Etape Du Tour Stage 3 from July 2020.

As you can see, that stage was up Ven-Top, which ensured that most people would be doing a steady effort of between 1 and 3 hours depending on ability, with very little freewheeling or drafting. Also, 1,616 finishers meant a useful set of data. As this was also a popular Zwift event, it would probably give a more representative distribution of the Zwift population rather than just those who race.

As the time taken for the event would be so different for everyone I settled on comparing the 20 minute best power for everyone rather than the average power for the whole ride. Some people might have been racing, some might have taken it easy knowing they had 3 hours ahead of them.
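To make the distinction concrete, here is a minimal sketch of how a best-effort average over a fixed window differs from a whole-ride average. The function name `best_average_power` and the ride numbers are made up for illustration, and ZwiftPower's own calculation may differ in detail.

```python
def best_average_power(watts, window):
    """Highest mean power over any `window` consecutive samples."""
    if window <= 0 or len(watts) < window:
        raise ValueError("need at least `window` samples")
    current = sum(watts[:window])  # running sum of the current window
    best = current
    for i in range(window, len(watts)):
        # Slide the window one sample to the right.
        current += watts[i] - watts[i - window]
        best = max(best, current)
    return best / window


# Hypothetical ride, one sample per minute: easy, a hard block, easy again.
ride = [150] * 10 + [250] * 5 + [150] * 10

whole_ride_avg = sum(ride) / len(ride)
best_5 = best_average_power(ride, 5)

print(f"whole ride average: {whole_ride_avg:.0f} W")
print(f"best 5 sample average: {best_5:.0f} W")
```

With one-second power samples, a 20 minute best would use `window=1200`. Dividing the result by rider weight in kg gives the w/kg figure used here.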

There are limits to the kinds of conclusions you can draw from this data set, so do not treat these figures as absolute facts. Full access to ZwiftPower would be needed if you wanted more accurate data.

With those disclaimers out of the way, onto the distribution plot: 20 minute w/kg along the X axis, number of riders on the Y axis.

Notice how 2.6 and 3.3 w/kg spike above the values just below them, and how there is a dip at 4.2 w/kg. Interesting coincidence how those values just happen to be near the C, B and A cat limits. Hmmm.

If we adjust the w/kg by multiplying by 0.95 (i.e. taking 95%) to get a ZwiftPower style FTP and then calculate the numbers that would fall within A, B, C and D cats, we get this:

A cat: 225
B cat: 688
C cat: 463
D cat: 214

Again, some interesting distributions. It shows that being the top of C cat would mean you are only in the top 58% of all Zwifters.
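For anyone wanting to check the percentages, here is a quick sketch that uses only the four category counts above. The names are my own choice, nothing official.

```python
# Category counts as quoted in the post above.
cat_counts = {"A": 225, "B": 688, "C": 463, "D": 214}

total = sum(cat_counts.values())


def share(cats):
    """Percentage of the sample that falls in the given categories."""
    return 100 * sum(cat_counts[c] for c in cats) / total


for cat, n in cat_counts.items():
    print(f"{cat}: {n:4d} riders, {share(cat):.1f}% of the sample")

# Everyone in A or B sits above the top of C.
above_top_of_c = share("AB")
print(f"A and B combined: {above_top_of_c:.1f}%")
```

A and B together come to a little over 57% of the sample, which is where the "top 58%" figure for a rider at the top of C comes from.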

Here is the raw data if you’d like to do your own analysis:

w/kg number
1.4 2
1.5 3
1.6 6
1.7 8
1.8 14
1.9 15
2 33
2.1 32
2.2 51
2.3 50
2.4 59
2.5 56
2.6 73
2.7 81
2.8 102
2.9 92
3 92
3.1 87
3.2 78
3.3 107
3.4 90
3.5 77
3.6 56
3.7 58
3.8 43
3.9 43
4 35
4.1 37
4.2 14
4.3 20
4.4 22
4.5 13
4.6 0
4.7 5
4.8 8
4.9 6
5 9
5.1 2
5.2 6
5.3 1
5.4 1
5.5 2
5.6 0
5.7 1
5.8 0
5.9 0
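If you'd rather not retype the table, here is a rough starting point in Python. The counts are copied from the table above. The variable and function names are just my choice, and the median is only resolved to the nearest 0.1 w/kg bin because that is all the table allows.

```python
# Riders per 0.1 w/kg bin, copied from the table above (1.4 to 5.9 w/kg).
COUNTS = [
    2, 3, 6, 8, 14, 15, 33, 32, 51, 50, 59, 56, 73, 81, 102, 92,
    92, 87, 78, 107, 90, 77, 56, 58, 43, 43, 35, 37, 14, 20, 22, 13,
    0, 5, 8, 6, 9, 2, 6, 1, 1, 2, 0, 1, 0, 0,
]
BINS = [round(1.4 + 0.1 * i, 1) for i in range(len(COUNTS))]

total = sum(COUNTS)
mean = sum(b * n for b, n in zip(BINS, COUNTS)) / total


def percentile_bin(p):
    """Lowest bin at which the cumulative count reaches p percent of riders."""
    running = 0
    for b, n in zip(BINS, COUNTS):
        running += n
        if running >= total * p / 100:
            return b
    return BINS[-1]


median = percentile_bin(50)
mode = BINS[COUNTS.index(max(COUNTS))]

print(f"riders: {total}")
print(f"mean: {mean:.2f} w/kg, median bin: {median}, most common bin: {mode}")

# Quick text histogram, one '#' per 3 riders.
for b, n in zip(BINS, COUNTS):
    print(f"{b:.1f} {'#' * (n // 3)} {n}")
```

Note that the table adds up to 1,590 riders, not the 1,616 finishers mentioned earlier.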

As much as I would like to look under the covers of CE power curve rules/limits, this really validates Zwift’s view to not share the details.

Nice one Aoi.

Interesting rundown. Realize that there’s some degree of self-selection bias going on in all likelihood, as I would guess that there are proportionately fewer D riders willing or able to enter an event that could take 3+ hours on a trainer to complete.

Very interesting. Well done.

According to Coggan, having an FTP of 3.2 W/kg (top of C) would only classify you as having “moderate” power output in the grand scheme of things whilst anything below 2.5 W/kg he classifies as “untrained/non-racer”.

keep in mind he is comparing those numbers to professional athletes / national or world championship level (in which case, yea, 3.2 is moderate cos it’s about half what the pros can do :D)

Hahaha!

Not your intention/focus but to me the study is actually another proof of the presence of cruising, one that I hadn’t thought of myself. Well done!

Now, this wasn’t a race, but back in 2020, long before CE, there were lots of people “grooming their W/kg” in events like this since it would still affect their ZP categorization. I know you get this, but others might not.

I think I’m gonna steal your method and sample CE races and compare to pre-CE races, run statistical tests for normality and see what gives. I’m guessing your sample, a single event as it may be, might actually be large enough in itself. It doesn’t look normal to me (in the statistical sense… or actually any sense).
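As a rough first pass, and not a substitute for a proper normality test, something like the sketch below would do. It fits a normal curve to the binned counts posted above, using the sample mean and standard deviation, and lists the bins whose counts sit furthest from what that curve predicts. Only the standard library is used. The cut-offs `THRESHOLD` and `MIN_EXPECTED` are arbitrary choices of mine, and a formal test (for example `scipy.stats.normaltest`) would be better run on unrounded per-rider values.

```python
from math import sqrt
from statistics import NormalDist

# Riders per 0.1 w/kg bin from the original post (1.4 to 5.9 w/kg).
COUNTS = [
    2, 3, 6, 8, 14, 15, 33, 32, 51, 50, 59, 56, 73, 81, 102, 92,
    92, 87, 78, 107, 90, 77, 56, 58, 43, 43, 35, 37, 14, 20, 22, 13,
    0, 5, 8, 6, 9, 2, 6, 1, 1, 2, 0, 1, 0, 0,
]
BINS = [round(1.4 + 0.1 * i, 1) for i in range(len(COUNTS))]

total = sum(COUNTS)
mean = sum(b * n for b, n in zip(BINS, COUNTS)) / total
sd = sqrt(sum(n * (b - mean) ** 2 for b, n in zip(BINS, COUNTS)) / total)
# Skewness: 0 for a symmetric distribution, positive for a longer right tail.
skew = sum(n * (b - mean) ** 3 for b, n in zip(BINS, COUNTS)) / total / sd ** 3

# Expected riders per bin if the sample followed the fitted normal curve.
fitted = NormalDist(mean, sd)
HALF_BIN = 0.05
expected = [
    total * (fitted.cdf(b + HALF_BIN) - fitted.cdf(b - HALF_BIN)) for b in BINS
]

THRESHOLD = 2.0    # flag bins more than about 2 standard errors off (assumption)
MIN_EXPECTED = 5   # skip sparse tail bins where the residual is unreliable

flagged = []
for b, obs, exp in zip(BINS, COUNTS, expected):
    if exp < MIN_EXPECTED:
        continue
    resid = (obs - exp) / sqrt(exp)  # Pearson residual
    if abs(resid) > THRESHOLD:
        flagged.append((b, obs, round(exp, 1), round(resid, 2)))

print(f"mean {mean:.2f}, sd {sd:.2f}, skewness {skew:.2f}")
for b, obs, exp, resid in flagged:
    print(f"{b:.1f} w/kg: observed {obs}, expected {exp}, residual {resid}")
```

With 46 bins, a couple of flagged bins would be expected from noise alone, so the interesting question is where they fall relative to the cat limits, not whether any exist.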

cool. i’m going to do some training

If you’re looking for a good event Andreas then maybe one of the Alpe du Zwift events? They had a few thousand riders and as the time would be a bit shorter you’d get a better representation of a one hour effort.

I only used the Ven-Top climb as it was in my history and easy to find for a quick and dirty comparison.

I did kind of expect to see statistical abnormalities around the category limits. Either it’s people deliberately not training so they don’t get bumped up into the next category (the biggest flaw of using power as a category limit in my opinion: it actively discourages people from improving, which is the antithesis of a training platform) or people adding a few kg here and there to keep their w/kg under the cat limits. I wouldn’t be surprised if a distribution based on raw watts rather than w/kg provided a more even bell curve.

Very good idea and interesting stats.
No one probably wants to change their category whether the race setting has categories or not.
It is a bad effect of the category system.
So I made a suggestion in another thread to ditch the category system.
I was harassed.

your post is great.

@Aoi_Niigaki Very good points!

Yes, that would be interesting, to make a comparison between distributions of W/kg and raw W on the one hand, and between pre and post CE on the other, test for normality and see if any of the tests fail. Also add a look at weight distribution. And, with a sample of several comparable events, divide by cat for the sake of sample size and statistical power.

(As a sidenote, I had a hard time finding CE events yesterday, even among the ZHQ ones. A passing coincidence?)

Ideally, one would want to look at pure races, for example the bazillion crits in the schedule, which are highly comparable. But then there is a methodological problem. Races are problematic. Someone could always argue that any deviations from normal distributions were caused by low participation among those who don’t stand a chance to stay in the front groups. And there is probably some truth in that.

But then again, if the distributions show deviations similar to your histogram, where there are discrepancies between cat limit performance and performance very close to the limit (e.g. between 3.3 W/kg and 3.1-3.2 W/kg), then I think that argument fails to provide a satisfying explanation.
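A crude way to put a number on that kind of discrepancy is to compare each bin with the average of its two neighbours. The sketch below does that for the three bins called out in the original post, using the counts posted there. The function name `neighbour_ratio` is my own, and the ratio is only a descriptive measure, not a statistical test.

```python
# Riders per 0.1 w/kg bin from the original post (1.4 to 5.9 w/kg).
COUNTS = [
    2, 3, 6, 8, 14, 15, 33, 32, 51, 50, 59, 56, 73, 81, 102, 92,
    92, 87, 78, 107, 90, 77, 56, 58, 43, 43, 35, 37, 14, 20, 22, 13,
    0, 5, 8, 6, 9, 2, 6, 1, 1, 2, 0, 1, 0, 0,
]
BINS = [round(1.4 + 0.1 * i, 1) for i in range(len(COUNTS))]


def neighbour_ratio(wkg):
    """Count in a bin divided by the mean count of the bins either side."""
    i = BINS.index(wkg)
    if i == 0 or i == len(BINS) - 1:
        raise ValueError("edge bins have only one neighbour")
    neighbours = (COUNTS[i - 1] + COUNTS[i + 1]) / 2
    if neighbours == 0:
        raise ValueError("both neighbouring bins are empty")
    return COUNTS[i] / neighbours


# Bins singled out in the original post as sitting near the cat limits.
for wkg in (2.6, 3.3, 4.2):
    print(f"{wkg} w/kg: {neighbour_ratio(wkg):.2f} x its neighbours")
```

A smooth curve would give ratios close to 1 away from the peak. With bins this small, random noise alone moves the ratio around, so a comparison across several events would be needed before reading much into any single value.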

Note: I have my windmills to fight and I do look for them. However, I struggle to come up with other natural explanations in case patterns like in the histogram turn out to be common. If anyone has ideas, let’s discuss them. Before any study. So we can agree to accept the results, whichever they may be.

I’m really eager to have a look at this. Unless someone beats me to it, I’ll get to it when I can find the time and… how shall we put it… gather some data. I’m a bit wary of discussing details since they tend to get censored around here - this is not an open discussion, even though they would probably point at subscriber integrity and data protection regulations if confronted. And I haven’t. Gotta pick the fights.

Events > rules > category enforcement

@k.kanai There’s already that other thread, so we shouldn’t derail. I just felt I wanted to quickly comment on your ideas from over there. Before turning back on topic again.

I wouldn’t say you were shot down. There are frequent/high profile posters who have been speaking for the idea of removing cats since forever. So you’re not alone. But then there are others, like me, who kinda like having the cats, only perhaps not the cats of today.

I’m from a country where road cycling as a sport is very weak and fragile, even more so today although the number of recreational road cyclists has increased dramatically during later years. But road cycling as a phenomenon over here is completely dominated by overpaid male middle-age middle managers trying to overcompensate their (our) fear of aging.

But let’s look at it from a youth/junior perspective. I like to do that. Zwift probably isn’t the go-to platform for the young breaking into The Beautiful Sport, cycling. But it could be. And for them, having a chance to develop skill, fitness and knowledge without getting obliterated by late 20’s to early 30’s cat A type riders everywhere they go in Zwift, to let them do their own thing on their own terms, so they can grow to become the ones who obliterate the middle managers, is what I want to see. And cats will help there and make things more fun. And no, age cats don’t necessarily cut it, because people develop at different speeds and during different periods. Separating the “mediocre” from the elite can promote growth better than an all-out performance Darwinism, that is my belief.

That said, cats will always be tricky to some extent. And there is nothing inherently unfair about removing cats. I do agree that it would solve many problems. It would insta-kill sandbagging and cruising and leave us with a perfectly nice and smooth bell-shaped curve when it comes to distribution of power or weight or whatever physical measures you throw into it. I just don’t like the price myself.

Sorry for deviating from the content. Please let me just say thank you.
Thanks for your great opinion.

In the real world where egos aren’t crushed on a screen, I would generally agree but I believe the zwift categorization is skewed to being too easy.

I’m sure this happens, but the effect is probably combined with another one: people who don’t game the system, they move into the bottom of the next category, they become demoralized because they went from winning a lot to winning never, and they enter fewer races. It seems like the cause is essentially the same, but I can’t guess how much of the effect is from cheating vs giving up. A results based system would help, but I’m starting to think that the large category ranges are at least as much of a problem. Improved category enforcement can’t entirely fix that.

And organiser-configurable category boundaries would help too.

I think you are being too nice to your fellow Zwifters… My opinion is the vast majority in those cusp areas are holding back to not get promoted.

Look at any post on either of the prominent Facebook groups about cats & upgrades and the vast majority of responses are about managing power to stay in the sweet spot of not getting promoted and being ‘competitive’. ZRL then adds to this issue… it’s been allowed to become the norm.

My point is that neither of us knows how significant either of these factors is. It’s still important to question the conclusions we draw from the data. There may be more than one explanation, and both could contribute to what we see in the data.

Zwift possesses data that could put those explanations in perspective. They could calculate the relationship between category upgrades and either rider weight increases or race participation rates.

Certainly, it’s very rarely a single factor, I just think it’s very much weighted towards those managing their output…
It kind of reinforces my perception, which comes from social media, forums, racing ZRL and Zwift in general over the years, that the culture has become one where upgrades are a punishment and should be avoided when it comes to racing.