This is a bit long and maybe a bit technical, but I think there are serious issues with the W/kg metric used for ranking racers. I like this sort of thing, and it seems to me there is a lot of room for improvement in what’s done at present. Computers are good at calculating complicated things, so why stick with overly simplistic formulas that don’t work well?
problems with present approach
One is the use of 20 minute power. Racing requires power over all durations, not just 20 minutes, and 20 minutes is on the long end of the efforts required to be competitive. Consider two riders with the same 20 minute power, where the second has superior 5 minute and 30 second power. The first rider won’t really have a chance. However, if one rider has better 5 minute and 30 second power, but the other has better 20 minute power, then the race gets interesting: it becomes a matter of tactics.
Another is the use of power per mass. Power per mass is a good metric up L’Alpe de Zwift, which is sustained and steep, but on Fuego Flats it’s more like power squared per mass or, equivalently for ranking purposes, power per square root of mass. So if you have two riders matched on power per mass, but one is heavier, the heavier rider will almost always have an advantage. IRL lighter riders are typically better climbers and worse sprinters, but in Zwift categories the lighter riders are no better at climbing and worse at everything else.
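To put numbers on this, here is a small sketch. The two riders, their masses and the 4 W/kg figure are made up for illustration; the flat metric is just the power per square root of mass mentioned above, not anything Zwift publishes.

```python
import math

def watts_per_kg(power_w, mass_kg):
    """The usual category metric: power divided by rider mass."""
    return power_w / mass_kg

def flat_metric(power_w, mass_kg):
    """Power per square root of mass, the flat-road proxy suggested above."""
    return power_w / math.sqrt(mass_kg)

# Two hypothetical riders, both at exactly 4 W/kg: (power in watts, mass in kg).
light = (240.0, 60.0)
heavy = (320.0, 80.0)

for name, (power, mass) in (("light", light), ("heavy", heavy)):
    print(f"{name}: {watts_per_kg(power, mass):.2f} W/kg, "
          f"{flat_metric(power, mass):.1f} W/sqrt(kg)")
```

Both riders land in the same W/kg category, but the heavier one scores noticeably higher on the flat metric.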
So we can correct these two issues.
use flat speed and VAM, not W/kg
We know actual racing consists of a mix of climbing, descending, and flats. Fortunately, Zwift knows how to calculate speed (IRL this involves coefficients we don’t know). So what Zwift could do is, for a given power number (and mass and height), assuming a “standard bike” (Zwift frame + wheels), on smooth pavement, calculate flat speed and climbing VAM (VAM is the rate of vertical ascent on steep climbs).
speed rating = sqrt( flat speed × VAM )
With this formula, if one of two riders climbs 10% faster but is 10% slower on the flats, they get nearly the same rating and will be in the same category. That will make for a more interesting race than W/kg categories, in which the two climb the same but the heavier, more powerful rider will always be better on the flats.
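Here is a rough sketch of how such a rating could be computed. Zwift’s actual speed model isn’t public to me, so everything physical here is an assumption: the air density, rolling resistance and 8 kg bike mass are guesses, CdA is passed in directly rather than derived from height, and VAM ignores air and rolling resistance (reasonable only on a steep climb). The function names are mine.

```python
import math

G = 9.81        # gravity, m/s^2
RHO = 1.2       # air density, kg/m^3 (assumed)
CRR = 0.004     # rolling resistance coefficient (assumed)
BIKE_KG = 8.0   # "standard bike" mass (assumed)

def flat_speed(power_w, rider_kg, cda_m2):
    """Steady flat-road speed in m/s, found by bisection on required power."""
    total_kg = rider_kg + BIKE_KG

    def required_power(v):
        # Aerodynamic drag power plus rolling resistance power at speed v.
        return 0.5 * RHO * cda_m2 * v ** 3 + CRR * total_kg * G * v

    lo, hi = 0.0, 40.0  # 40 m/s is far above any realistic cycling speed
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if required_power(mid) < power_w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def climbing_vam(power_w, rider_kg):
    """Vertical ascent rate in m/h, assuming all power goes into lifting mass."""
    return 3600.0 * power_w / ((rider_kg + BIKE_KG) * G)

def speed_rating(power_w, rider_kg, cda_m2):
    """Geometric mean of flat speed and VAM; the units are arbitrary."""
    return math.sqrt(flat_speed(power_w, rider_kg, cda_m2)
                     * climbing_vam(power_w, rider_kg))

print(speed_rating(300.0, 75.0, 0.32))
```

Since only the ranking matters, mixing m/s and m/h in the product is harmless: it just scales every rider’s rating by the same constant.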
what to use for “power”?
Now we need a power number. Presently 20 minute power is typically used (ramp test is 1 minute power). The problem here is that a rider may produce a really good 19 minute power, but then stop. If the last minute is at zero watts, this will result in a 5% lower 20 minute power, or perhaps no 20 minute power at all if the ride ends there. Or a rider may climb L’Alpe de Zwift and get a really good 50 minute power, but not as good a 20 minute power. The additional issue is that we want to include powers over the whole range (10 second, 3 minute, 20 minute, 60 minute) in calculating race ability, since all contribute.
cleaning up maximal power curve
So we have a maximal power curve, which is the best effort for every duration from 1 second to 3600 seconds (for example). But we need to clean it up first. To clean it up, I’ll make two assumptions:
 If I can produce P average power for seconds 1 to t, then for second t+1 I can produce at least 2P/3.
 If I can produce P average power for t seconds, then I can produce P for any duration < t (sometimes maximal power curves have bumps due to interval efforts, where the duration becomes long enough to include a second interval).
So this can be fixed:

Step thru the curve from 1 to 3600 seconds. For each duration, take the maximal work (power times time) and calculate a lower bound on the maximal work for the next duration (anything less than this is assumed to be due to submaximal effort by the rider at that duration). The lower bound on maximal work for duration t + 1 = the maximal work for duration t multiplied by (t + 2/3) / t. Unless the actual maximal work for duration t + 1, from rider data, is more than this, use the lower bound as the maximal work for duration t + 1. From maximal work, maximal power = work / duration. So riders are always assumed to be able to produce 2/3 of the average power they have sustained so far for one extra second, then 2/3 of that new average power for the next second, etc. If rider data has a better result than this, we use rider data instead, and that data sets the lower bound for the next second.

Step thru the curve from 3600 seconds to 1 second. If the maximal power at a shorter duration is ever lower than at the next longer duration, raise it to match. Note I’m going in the reverse duration direction: if you can produce a power for longer times, you can produce the same power for shorter times.
With this algorithm, it is assumed that riders can hold roughly 70% of their 20 minute power for an hour, which I feel is a safe lower bound even for extremely endurance-challenged riders. This is determined by the 2/3 coefficient: with it, the lower bound on power falls off as about the cube root of the duration ratio, and the cube root of 1/3 is about 0.69. The coefficient could be increased if this is too conservative.
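The two passes can be sketched like this. It is a minimal sketch, assuming the curve is a list with one entry per second, where entry i holds the best average power for duration i + 1 seconds; the function name and the floor_fraction parameter are mine.

```python
def clean_power_curve(raw_power, floor_fraction=2.0 / 3.0):
    """Return a cleaned copy of a maximal power curve.

    raw_power[i] is the best average power (W) for a duration of i + 1 seconds.
    """
    n = len(raw_power)
    # Work in joules for each duration: power times time.
    work = [p * (i + 1) for i, p in enumerate(raw_power)]

    # Forward pass: one extra second is worth at least floor_fraction of the
    # average power so far, so work(t + 1) >= work(t) * (t + floor_fraction) / t.
    for i in range(1, n):
        t = i  # duration in seconds of the previous entry
        lower_bound = work[i - 1] * (t + floor_fraction) / t
        if work[i] < lower_bound:
            work[i] = lower_bound

    power = [w / (i + 1) for i, w in enumerate(work)]

    # Backward pass: a power held for a longer duration can also be held
    # for any shorter one, so the curve must never rise with duration.
    for i in range(n - 2, -1, -1):
        if power[i] < power[i + 1]:
            power[i] = power[i + 1]
    return power

# A rider with a solid 20 minutes at 300 W and no data beyond that.
raw = [300.0] * 1200 + [0.0] * 2400
cleaned = clean_power_curve(raw)
print(cleaned[3599] / cleaned[1199])  # fraction of 20 minute power held for an hour
```

The printed ratio is the lower bound discussed above, a little under 0.70 with the 2/3 coefficient.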
calculating effective power from maximal power curve
Now I have a cleaned up maximal power curve from 1 to 3600 seconds. I need to calculate an effective power from the whole curve, not just one arbitrary duration on the curve.
The following formula is one suggestion:
Pavg = exp( [ sum over t = 5 to 3600 of { ln P(t) / (t + 10) } ] / 5.51725 )
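I added 10 seconds to each duration in the denominator to slightly deweight very short durations in the sum. The constant 5.51725 is the sum of the weights 1 / (t + 10) over the same range, so the result is a weighted average of ln P(t).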
BTW, I changed this from what I originally posted, since I think the original overweighted sprinting powers. By averaging the natural logarithm, then taking the exponential at the end, it’s fractional differences in power which matter, not absolute differences. So if one rider has twice the power at the sprint end, and another has twice at the endurance end, those will tend to cancel out, rather than the larger difference in absolute watts at the sprint end dominating. I’m assuming time here is in seconds in 1-second increments, and power is in watts.
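In code, the formula might look like the sketch below. It assumes the same list layout as before (entry i is the power for duration i + 1 seconds) and that every power in the range is positive, since the logarithm needs that. Rather than hard-coding 5.51725, it sums the weights as it goes, which gives the same constant for the 5 to 3600 second range; the function and parameter names are mine.

```python
import math

def effective_power(power_curve, t_min=5, t_max=3600, offset=10.0):
    """Weighted geometric mean of the power curve, weights 1 / (t + offset).

    power_curve[i] is the maximal power (W) for a duration of i + 1 seconds.
    """
    weighted_log_sum = 0.0
    weight_sum = 0.0
    for t in range(t_min, t_max + 1):
        weight = 1.0 / (t + offset)
        weighted_log_sum += weight * math.log(power_curve[t - 1])
        weight_sum += weight
    # Dividing by the total weight makes this a true weighted average of ln P.
    return math.exp(weighted_log_sum / weight_sum)

# A made-up curve that decays slowly with duration, from about 900 W at 1 s.
curve = [900.0 * t ** -0.15 for t in range(1, 3601)]
print(effective_power(curve))
```

Because the average is taken in log space, scaling the whole curve by some factor scales the result by exactly that factor, and a flat curve returns its own value.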
Summary:

don’t divide by mass: calculate a flat speed and a climbing VAM from power and mass and height, assuming a standard bike and wheels, and take the geometric mean.

calculate a “corrected” maximal power curve, to compensate for the fact there aren’t quality efforts at every duration.

from the maximal power curve, calculate an “average” power for the curve, which is applied to the speed formula.