Monday, June 13, 2022

Mapping NTRP to WTN - Men's Doubles

The USTA has started publishing the new World Tennis Number (WTN) for players on their profile on usta.com.  I did some initial analysis last week, but now I'm able to do a more thorough and complete analysis, so this begins a sequence of posts on it.

My process for this analysis is to look at a sampling of the USTA League players who received a 2021 year-end rating/level and look at the distribution of WTN levels for each different gender, discipline, and level.

By limiting it to players that received a 2021 year-end level I am only considering players who got a recent NTRP rating so that I'm not using stale NTRP levels in the comparison.  Since NTRP levels don't change again until the end of 2022, a player could have gotten better or worse in the 6 months since ratings were published, but this seems like a reasonably consistent approach to doing this mapping.  Ideally this would be done using WTN ratings at the time the NTRP ratings were published, but we haven't had WTN until now so we do the best we can.
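The tallying step described above can be sketched roughly as follows. This is only an illustration of the counting, not the actual code behind the chart: the records are made-up placeholders, and the names `sample_players` and `tally_wtn_by_ntrp` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical placeholder records: (2021 year-end NTRP level, WTN doubles level).
# These values are made up for illustration; they are not data from this analysis.
sample_players = [
    ("3.5", 25), ("3.5", 24), ("4.0", 22), ("4.0", 22), ("4.0", 21), ("4.5", 17),
]

def tally_wtn_by_ntrp(players):
    """Return (count per WTN level overall, count per WTN level for each NTRP level)."""
    total = Counter()
    by_ntrp = defaultdict(Counter)
    for ntrp, wtn in players:
        total[wtn] += 1
        by_ntrp[ntrp][wtn] += 1
    return total, by_ntrp

total, by_ntrp = tally_wtn_by_ntrp(sample_players)

# One row per WTN level: the total count, then the breakdown by NTRP level.
for wtn in sorted(total):
    breakdown = {ntrp: counts[wtn] for ntrp, counts in sorted(by_ntrp.items()) if counts[wtn]}
    print(wtn, total[wtn], breakdown)
```

Each printed row corresponds to one bar in a stacked chart: the WTN level, the total count, and the per-NTRP segments.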

In this post, I'm looking at WTN's for men's doubles.  What you will see in the chart below is the count of players at each WTN level in total and broken out by NTRP level.  So let's have it!


Looking at the general distribution, we see a fairly typical normal distribution to the left of WTN 22, but to the right, things aren't quite as prototypical.  22 itself is more of a peak than expected, with a quick drop to 23 and 24 before a normal gradual slope, then a quick drop again, and then it is missing the long tail we expect from a normal distribution.  It is certainly possible this is the skill level profile of male doubles players, and it could be that the players that would be in the long tail aren't playing competitive matches and so aren't in the WTN system.

There is an odd blip of men at WTN 3, including some 4.5s and 5.0s, with more players at WTN 3 than at any of WTN 4 through 8.  Both the presence of 4.5s/5.0s there and the relatively high count seem a bit odd.

Comparing this to the women's chart, we see a difference in the WTN with the largest number of players, 22 for the men and 25 for the women.  Also, where the women had virtually no one at 10 or better, the men have a fair number, including the blip at 3.

But looking at the counts by NTRP level is what is perhaps more interesting and will give us an idea of how NTRP maps to WTN.

As you'd expect, there is for the most part a normal curve within each level, that is until you get to 3.0, which has some ups and downs across WTNs that cause the oddity noted above.  What is perhaps not expected is how wide each NTRP level is.

There are a noticeable number of 4.5s ranging from 3 to 27 (2 to 31 for all) with an average of 17.2 and standard deviation of 4.9.  If there were a direct and perfect conversion from NTRP to WTN, one NTRP level would correlate with about 4 WTN levels, so having a range of 25 (!!!!) levels, with some players even beyond that, seems very high.

Similarly, there are a reasonable number of 4.0s from 10 to 30 (3 to 34 for all) with an average of 21.4 and standard deviation of 4.09.
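For reference, an average and standard deviation like the ones quoted above can be computed directly from the per-WTN counts for an NTRP level. The sketch below assumes a population standard deviation (I am not specifying which variant the figures above use), and the counts in `toy_counts` are made up purely to show the arithmetic.

```python
import math

def mean_and_std(counts):
    """Mean and population standard deviation of WTN levels from {wtn_level: player_count}."""
    n = sum(counts.values())
    mean = sum(level * c for level, c in counts.items()) / n
    variance = sum(c * (level - mean) ** 2 for level, c in counts.items()) / n
    return mean, math.sqrt(variance)

# Hypothetical counts for a single NTRP level (made up, not the chart's data).
toy_counts = {16: 1, 17: 2, 18: 1}
mean, std = mean_and_std(toy_counts)
print(round(mean, 1), round(std, 2))
```

Weighting each WTN level by its count gives the same result as listing every player individually, so the histogram is all that is needed.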

The explanation of course is that NTRP and WTN use different algorithms and there is no direct mapping.  And WTN could argue that their algorithm is better and these large ranges are actually correct and more accurate, but that would be indicting NTRP as not being accurate then.

One key algorithm difference is that WTN calculates separate singles and doubles ratings, and that can certainly be the reason for some variance, e.g. someone could be good at doubles and bad at singles but play both.  Their NTRP will be somewhere in the middle but WTN will have the individual ratings farther apart.

Additionally, which matches the USTA includes for WTN has not, to my knowledge, been shared yet.  If WTN includes matches that NTRP does not, that could be a contributor to differences between the two systems.

Back on the genders though, comparing the average WTN for males and females, we get:

  • 3.0 - women 28.2 vs men 28.4
  • 3.5 - women 24.8 vs men 24.6
  • 4.0 - women 21.9 vs men 21.4
  • 4.5 - women 18.7 vs men 17.2

There is not a significant difference between these at most levels, the men being a few tenths better at 3.5 and 4.0, and 1.5 better at 4.5, but surprisingly the average 3.0 woman rates better than the average 3.0 man!  Something seems amiss here, as there is a general consensus that on the NTRP scale a male is about 0.5 NTRP better than a female of the same level.  That would translate to about 2 WTN levels, and we see the difference at 4.5 approaching that, but at the other levels it does not.  I'm not sure the gender neutrality part of WTN is fully working yet.
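The gaps are easy to check with a few lines of arithmetic on the averages listed above. The variable names here are just for illustration; the numbers are copied straight from the list.

```python
# Average WTN by NTRP level, copied from the list above (lower WTN is better).
women = {"3.0": 28.2, "3.5": 24.8, "4.0": 21.9, "4.5": 18.7}
men = {"3.0": 28.4, "3.5": 24.6, "4.0": 21.4, "4.5": 17.2}

# Positive gap: the average man has the lower (better) WTN at that NTRP level.
gender_gap = {level: round(women[level] - men[level], 1) for level in men}

# Drop in the men's average WTN from each NTRP level to the next one up.
levels = sorted(men)
men_step = {f"{a}->{b}": round(men[a] - men[b], 1) for a, b in zip(levels, levels[1:])}

print(gender_gap)
print(men_step)
```

The gap is -0.2 at 3.0 (the women are better), 0.2 at 3.5, 0.5 at 4.0, and 1.5 at 4.5, while the men's averages drop by 3.8, 3.2, and 4.2 WTN levels from one NTRP level to the next.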

The other thing you can do with this chart is see how many NTRP levels there are at a given WTN level.  For example, from WTN 20 to 27 there are 4 or perhaps even 5 noticeable segments of players, from 2.5/3.0 to 4.5.  If WTN were right, it would be saying that a 3.0 and a 4.5 who both have WTN 25 would have a competitive match.  That seems hard to believe, but perhaps there really are some edge cases where a 4.5 has a rating that is lagging their ability and the WTN algorithm, despite being updated weekly, hasn't caught up with a 3.0's improvement.  But this seems a bit of a stretch.

Which is right?  WTN or NTRP?  I can't say at this point, but stay tuned, I'll keep doing analysis.  And it is likely that neither is "right" or "wrong" and they are just different.

See the prior post on Women's Doubles, and stay tuned for this same post for the Women's Singles, and Men's Singles.

Update: I did an update that only looked at high confidence WTN ratings here.

2 comments:

  1. When I see ranges that big, it tells me there is a very low correlation between NTRP and WTN levels. If they were highly correlated, we'd expect those graphs to have bigger spikes with little overlap. Have you done similar work comparing NTRP's and UTR's?

  2. Yea, something seems way off with these ratings. And sample sizes have to be big enough for each player to begin with for any rating system, including NTRP, which often isn't happening. The main thing I'm sure a lot of people are wondering is what is going to happen with this WTN, if anything?

    From my experience, male/female ratings are closer to 1.5 levels difference for NTRP. For example, a mid 4.5 female would roughly equate to a borderline 3.5/4.0 male. The 3.5 guy is usually better than the 4.5 girl for 8.0 mixed combos from what I've seen, but not always.
