Hi Flint,
Great to see the first steps taken in the form of these test events. Obviously the actual post-race evaluation will need to wait until, well, after the event. However, I thought I’d already share a constructive pre-race evaluation. There is much to like about the direction you are taking, but also room for improvement, as I try to explain below.
- Category Enforcement: the technical side of it seems to work (for me). I have a minimum enforced category and can still choose to ride up if I want to. Thanks for including this option right from the start.
- Allocating a Zwift Category: according to your post you are now looking at all Zwift data - besides workouts - which is a great improvement IMO. It brings the category definition back to FTP/weight rather than race data/weight, which has been a flaw in ZP and has caused rapid inflation. It also allows for the allocation of as many riders as possible, including those that have not participated in events yet. Checkbox ticked.
- Changing weight shortly pre-race to circumvent enforcement: I haven’t gone to great lengths as I only tried this via Companion, but I can confirm this did not allow me to ride a category down, i.e. to ‘sandbag’. I don’t know what would happen if I put in a couple of rides at a higher weight, but I’m not willing to try this either. So another checkbox ticked.
- Use as a toggle option for events: thanks for putting this high on the list of priorities.
These are the bricks of the house - to use your previous analogy - so I just wanted to acknowledge them specifically.
Then back to the important question you posed earlier: does racing (or the category assignment) feel fair?
This question can swing many ways depending on its interpretation. I can only speak for myself: I view my racing as a thrilling and engaging workout with the aim of getting stronger. If Zwift racing doesn’t meet these criteria then I will not participate in it.
There are two main problems with the traditional ZP bands: riders ‘hiding’ their actual capabilities, and matched riders getting upgraded at a different point depending on their weight.
The inclusion of all ride data partly addresses the first point, and I assume that the intention of including MAP and VO2max prediction is to add some layers on top of this. But this adds two new variables. Not only that, but it also starts looking at absolute Watts rather than W/kg, which you could say is a third new variable. I do wonder if this is the best choice, especially since these short aerobic efforts would primarily be required on climbs. I honestly don’t know much about how MAP or VO2max estimates based on short intervals relate to eFTP/CP. I do know that the CP I’m getting from GoldenCheetah is pretty accurate even when I mostly ride Z2-3 and only put in one max effort of 5-8 minutes. Something to think about, I would say.
The risk with shifting from W/kg to just Watts for MAP is that you are basically using an additional and different power curve to fit riders. This means there is a new variable that may split riders of similar abilities (and again depends largely on weight). I think there are already good examples in this thread demonstrating where this goes wrong, but for further illustration: a limit of 360W or 5.4 W/kg simply means that a 65, 80 or 95kg rider is limited to 5.4, 4.5 or 3.8 W/kg, respectively, before they are upgraded to ‘A’ by this metric. If I try to picture a race on, say, London Loop (Box Hill), then unfortunately I don’t think this will provide ‘fair’, thrilling or engaging racing. Another empirical observation from TTTs: as a B, I can draft perfectly fine behind a bigger rider putting out 360W. If that same 360W cap applies going uphill, then this rider is limited to 3.8-4 W/kg. Now I don’t know which duration falls under this limit, but these are 20-min B-category levels. In other words: that rider cannot drop a mid-B rider on the flat and is not allowed to follow that same mid-B rider going up.
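To make the arithmetic above easy to check, here is a minimal sketch. It assumes the rule works the way I read it (you are upgraded once you hit either 360W or 5.4 W/kg, whichever comes first); the function name and defaults are mine, not anything from Zwift.

```python
def effective_wkg_cap(weight_kg, watt_cap=360.0, wkg_cap=5.4):
    """Highest W/kg a rider can hold before hitting either limit.

    Assumption: an upgrade is triggered by whichever of the two
    limits (absolute Watts or W/kg) is reached first.
    """
    return min(wkg_cap, watt_cap / weight_kg)


for weight in (65, 80, 95):
    print(f"{weight} kg: {effective_wkg_cap(weight):.1f} W/kg")
# 65 kg: 5.4 W/kg
# 80 kg: 4.5 W/kg
# 95 kg: 3.8 W/kg
```

The lighter rider is bound by the W/kg limit, while the two heavier riders are bound by the absolute Watt limit well before they get anywhere near 5.4 W/kg.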
The equivalent questions can be asked about the relation between VO2max and in-game speed. It’s a Pandora’s box, and the solution may very well be worse than the problem it was intended to fix.
So the question is how to go forward. I really want your test events to succeed, so that you ultimately can offer better racing to everyone. You could wait for the initial feedback after each event on the different courses, but I worry that a strong-C will not be tempted to even start in an A-pack. It would be daunting already for me as a mid-B. So the risk is that feedback from these riders gets ‘lost’, with no race results to prove their point.
What I would strongly consider is changing to an iterative approach. For week one, just see how the ‘technical’ side works. Provide non-ZRL racers with the experience of cat-enforcement, based on ZP categories but including all Zwift ride data. So Zwift-FTP/weight(kg), without the inclusion of MAP or VO2max criteria. This will be your ‘benchmark’. Then you could run your MAP and VO2max calculations on the riders that finished these test events, with the goal of seeing which riders you would have upgraded with either variable, and whether this makes sense given how they performed. You could already do the same for existing ZRL data, but this will not be an ‘average Zwifter’ population. If you find that your new metrics make sense (and which of them), then you could add one layer on top for categorization in week 2 and repeat. Then riders can give feedback on whether their race experience improved or not.
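A rough sketch of what that retrospective check could look like. Everything here is hypothetical: the `Result` fields, the 360W / 5.4 W/kg limits reused from my example above, the "top 20% of the field" threshold, and the three riders, which are made-up placeholders rather than real results.

```python
from dataclasses import dataclass


@dataclass
class Result:
    rider: str
    weight_kg: float
    map_watts: float   # hypothetical MAP estimate for this rider
    finish_pct: float  # finish position as a fraction of the field, 0.0 = winner


def flag_upgrades(results, watt_cap=360.0, wkg_cap=5.4):
    """Riders the extra MAP criterion would have upgraded (assumed rule)."""
    return [
        r for r in results
        if r.map_watts >= watt_cap or r.map_watts / r.weight_kg >= wkg_cap
    ]


def split_by_performance(results, top_fraction=0.2):
    """Split flagged riders by whether their benchmark race result supports the upgrade."""
    flagged = flag_upgrades(results)
    supported = [r.rider for r in flagged if r.finish_pct <= top_fraction]
    questionable = [r.rider for r in flagged if r.finish_pct > top_fraction]
    return supported, questionable


# Made-up placeholder riders, only to show the mechanics.
benchmark = [
    Result("rider_a", 95, 365, 0.60),  # over 360W, finished mid-pack
    Result("rider_b", 65, 355, 0.05),  # over 5.4 W/kg, finished near the front
    Result("rider_c", 80, 300, 0.30),  # under both limits
]
supported, questionable = split_by_performance(benchmark)
print("supported by results:", supported)
print("questionable:", questionable)
```

If the ‘questionable’ list turns out to be long for a given metric, that would be a sign the metric splits riders who were in fact well matched in the benchmark race.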
What I can’t stress enough: there is a risk of ending up in an endless process of fiddling with metrics. IMO the biggest flaw to solve is to estimate each rider’s FTP more accurately to help the initial seeding of riders. Ideally based on all rides (now ticked) and on a larger part of the power curve (e.g. a strong 18-min effort could also bump your FTP and thereby category). One could try to iterate in a way to “normalize” this FTP/weight for different weights and heights based on the CdA calculations, but I don’t think it’s worth going further than that. From there, the easiest and fairest way to upgrade riders is to implement a ranking system. Those that perform well can improve their fitness in the next category.
Full disclosure: I am a lightweight, average height, mid-B rider. The test event places me in ‘B’, which feels right for me. While racing I generally need to put out 0.2-0.3 W/kg more than a mixed-group average to stay in the bunch. I currently ride at high sweet spot to stay in the front bunch of a B-peloton, and at some point I get dropped at a critical course feature. I would have a tough job beating some cruiser-Cs, but I definitely would not enjoy racing in the C-category, nor in the A-category. If I ever do well in a B-cat race, I would want it to be because I beat similar/matched riders. Not because they were put in a different category. Though I am currently on the ‘wrong’ side of the ZP categories, I don’t want ‘the other side of the spectrum’ to get a similarly poor race experience. It is demoralizing to get upgraded before even being allowed to be competitive.