Thursday, June 16, 2022

What does WTN do with professionals, and what is the Pro Zone?

The World Tennis Number (WTN) from the ITF is now being published on USTA player profiles, and is advertised as a rating going from 40 for a beginner to 1 for a pro.  But what exactly does that mean?  What are professionals' WTNs?  And how can you relate a pro's WTN to your own?

First, from the various FAQs on WTN, there is the text "Pro players will be closer to 1".  This isn't explicit, but since the FAQ precedes this statement by giving the range as 40 to 1, one might think that the very top pro is a 1.0 and other pros would be in the 1.x or perhaps even 2.x range.

Second, in my perusal of WTNs, the smallest I've seen is 1.1, this for a male player from Mississippi with no NTRP rating.  There are actually very few in the 1's, but there are a fair number in the 2's.  So if a pro is a 1, is this 1.1 on par with pros?

When I did my chart showing a hypothetical mapping from NTRP to WTN, I left a little room for pros (NTRP 7.0) to live in the 1-3 WTN range.  My guess at the mapping has been proven wrong with the ranges much larger with huge overlap, but if there is no room for the pros, what is done for them?

It turns out the USTA, at least, is not publishing WTNs for pros.  If you go look up Jack Sock for example, you won't see a WTN showing for singles or doubles; instead the meter is maxed out with "Pro Zone" text in the middle of the widget.

It appears what happens is anyone that is an actual "pro", this based on having an ATP/WTA ranking, will simply be listed as a "Pro" and that appears to be what "Pro Zone" means.  This from an LTA web-site:

A world-wide rating system that ranges from 40 (recreational players) to 1 (pro players) - players with an ATP/WTA ranking will be listed as PRO

What I'm not sure of is if the matches professionals play are part of the WTN dataset or not.  You'd think they would be, in which case my guess is the WTNs could go below 1.0 and perhaps even negative.  Perhaps this is why they don't show them and instead just show "Pro Zone".

But what will happen when Jack quits playing and his ATP ranking goes away?  Shouldn't his WTN then become visible?  What will it be?

This exposes a bit of a challenge for any rating system where there are arbitrary upper and lower thresholds.  What happens if a 39.9 does so poorly that their calculated WTN goes above 40?  What happens when a player, recreational player or pro, wins so much their WTN goes below 1?  Does the whole range have to be recalibrated and everyone's WTN changes to keep the integrity of the 40 and 1 boundaries?
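To make the two options concrete, here is a minimal sketch of what "enforce the boundary" versus "recalibrate everyone" could look like.  The raw values, the function names, and the linear rescaling are all my own assumptions for illustration; the ITF has not published how WTN handles values outside the advertised range.

```python
# Hypothetical illustration only: this is not the ITF's method.
WTN_BEST, WTN_WORST = 1.0, 40.0

def clamp(raw):
    """Pin an out-of-range raw rating to the nearest published boundary."""
    return max(WTN_BEST, min(WTN_WORST, raw))

def rescale(raws):
    """Linearly remap every raw rating so the extremes land exactly on 1 and 40."""
    lo, hi = min(raws), max(raws)
    span = hi - lo
    return [WTN_BEST + (r - lo) * (WTN_WORST - WTN_BEST) / span for r in raws]

# Made-up raw ratings: one better than 1, two in range, one worse than 40.
raw_ratings = [-0.5, 12.0, 25.0, 41.3]

clamped = [clamp(r) for r in raw_ratings]
rescaled = [round(r, 1) for r in rescale(raw_ratings)]

print(clamped)
print(rescaled)
```

Clamping leaves the two in-range players untouched and simply hides how far the outliers went past the boundary, while rescaling keeps the 40 and 1 endpoints honest at the cost of moving everyone's number.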

It appears the ITF (or USTA at least) is punting on the subject and just not showing pros' WTNs.  And beginners, at this point at least, aren't playing any matches that get into the system so we don't have to worry about that yet.

Although on this last point, I do see some NTRP 2.5s with WTNs of 40.0, not many, but a few.  So it appears this boundary is enforced, at least from what is published.

And in another "Whacky WTN" moment, I found a male player who is a self-rated NTRP 5.5, and in singles they show as "Pro Zone", but in doubles they are a measly 22.3 which is the WTN you might see for an NTRP 3.5 or even 3.0 player!  Similarly I found a female player with no NTRP that is "Pro Zone" in singles but a WTN 17.4 in doubles.  And there are others that are Pro Zone in singles or doubles and a double digit WTN in the other.  And some of these wild differences have the high confidence blue checkmark as well.

I'm sure some of what I'm seeing are exceptions or edge cases, but I'm coming across a fair amount.  It is interesting to say the least.

Observations about WTN for USTA League players after two weeks of publishing

The USTA has now published the new World Tennis Number (WTN) for two weeks, so there is enough data, real and anecdotal, to begin making some observations.

As stated by the ITF/USTA, a player's WTN is supposed to be updated once a week on Wednesday and incorporate matches through the previous Sunday.  So far we've had the ratings published on Wednesdays so they are hitting that, but are they incorporating the matches they say they are?

That leads to the first observation which is it appears there may be a delay in matches being included.  I've seen several cases where players played matches between 6/5 (the previous Sunday) and 6/12 (the Sunday before the 6/15 publish) and all indications are those matches were not included in the calculations.  And yes, I checked, the matches were entered on time and prior to 6/12.
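For anyone wanting to check their own matches against the stated schedule, the date arithmetic can be sketched as follows.  The function names are mine, and the rule is just the "Wednesday publish, matches through the previous Sunday" statement taken at face value, not anything confirmed about how the ITF actually processes matches.

```python
from datetime import date, timedelta

def cutoff_sunday(publish_day):
    """Return the most recent Sunday strictly before the publish date."""
    # date.weekday(): Monday is 0 ... Sunday is 6
    days_back = (publish_day.weekday() + 1) % 7 or 7
    return publish_day - timedelta(days=days_back)

def should_be_included(match_day, publish_day):
    """True if a match falls on or before the cutoff for that publish."""
    return match_day <= cutoff_sunday(publish_day)

publish = date(2022, 6, 15)              # the second Wednesday publish
print(cutoff_sunday(publish))            # the Sunday the publish should cover
print(should_be_included(date(2022, 6, 9), publish))
```

By this rule a match played on 6/9 should have been part of the 6/15 publish, which is the kind of match that appears to have been missed.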

This may just be growing pains and it will get ironed out, but if a player's matches are not consistently included, it makes it difficult to follow one's WTN and derive any meaning or benefit from tracking it.

But on the other side, there are players that played no new matches, and their WTN changed from week one to week two.  This isn't right or wrong, but does perhaps give us some insight into the algorithm.  It is possible that the matches included actually changed due to catching up, despite none being played, and this would be another indication of some growing pains in the data collection, but it is perhaps more likely that the WTN algorithm simply allows for changes to a player's rating when they don't play due to prior opponents playing and their rating going up or down.

It is a classic debate with ratings algorithms over whether one should only look at the rating of the opponent at the time of the match, or whether their future results, and them getting better or worse, should have an effect on your rating.  As I understand it, the dynamic NTRP rating calculated throughout the year only looks at the rating at the time of the match and your dynamic rating won't change if you don't play.  UTR on the other hand will have a player's rating change when they don't play (wildly so at times) and it appears WTN is following UTR's lead in this area.  I have not seen wild swings with WTN, but we only have one update's worth of data.
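The difference between the two approaches is easy to see with a toy model.  To be clear, this is not the NTRP, UTR, or WTN algorithm; the players, the numbers, and the "average of opponent rating plus a margin" formula are all made up purely to show how a rating can move without the player stepping on court.

```python
# Toy model only: a player's rating is the average of
# (opponent rating + result margin) over their matches.
matches_for_alice = [
    # (opponent, opponent's rating when the match was played, result margin)
    ("bob", 20.0, -1.0),
    ("cara", 18.0, 0.5),
]

def snapshot_rating(matches):
    """Use each opponent's rating as it stood on the day of the match."""
    return sum(then + margin for _, then, margin in matches) / len(matches)

def retroactive_rating(matches, current_ratings):
    """Re-evaluate old matches using each opponent's rating today."""
    return sum(current_ratings[opp] + margin for opp, _, margin in matches) / len(matches)

week1 = {"bob": 20.0, "cara": 18.0}
week2 = {"bob": 19.0, "cara": 18.0}   # Bob kept playing and improved (lower is better)

print(snapshot_rating(matches_for_alice))            # same in both weeks
print(retroactive_rating(matches_for_alice, week1))
print(retroactive_rating(matches_for_alice, week2))  # moves although Alice didn't play
```

In this sketch the snapshot rating stays put from week one to week two, while the retroactive rating improves by half a point just because a past opponent improved.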

A big observation I perhaps should have led with is that in mapping NTRP to WTN, you end up with some extremely large ranges.  I showed this in my analysis of singles (men and women) and doubles (men and women) mappings, but the summary is that doubles has a wider range than singles, but even singles has one NTRP level mapping to 10+ WTN levels (doubles has 15+).  A side-effect of this is that some WTN levels have noticeable representation from 3-4 NTRP levels.  If WTN is to be believed, these players from the 3-4 NTRP levels would have a competitive match together.

Another observation is that some players have dramatically different singles and doubles WTNs.  Differences of 5 are not uncommon at all and there are some that are 10 or more different.  And from personal experience, some of the differences are way off and perhaps even backwards.

Note that WTN is arguably better than NTRP in that it calculates separate ratings for singles and doubles, and the majority of players are probably better at one than the other.  The question is if what the WTN is calculating reflects reality or not.  Perhaps it does overall but there are at least some exceptions that make you go hmmm.  But the separate ratings can also be an explanation for some, but not all, of these observations.

WTN has been positioned as a gender neutral rating, but from what we are seeing, in part due to the really wide ranges noted above, there are women that are two or three NTRP levels lower than some men, that have better WTN ratings than the men.  For example, a 3.5 woman with a WTN that is 2 levels better than a 4.5 man.  It seems improbable that this is accurate, but NTRP has its edge case issues too so I wouldn't throw WTN out because of this, but something seems off.

Note that WTN has said it uses data from four years back, and perhaps even started using data six or more years back, so I don't think it being new and lack of data is an explanation.  I would have thought the observations I'm making would have been made during development and addressed if possible.  Perhaps WTN doesn't consider any of these to be an issue.

Given all these observations, I'm not sure a USTA League player will find WTN terribly valuable.  It is clearly different than NTRP, sometimes significantly higher or lower than expected, and your WTN going up or down or being at a certain level may give no indication of whether you will be bumped up or not which is what a lot of players care about.  Since the USTA says NTRP isn't going away, WTN may be just a curiosity for players to follow for fun.

And since it appears there are some clear inconsistencies in the mapping and ranges, the promise of the WTN Game zONe being a good indicator of a competitive match may or may not be accurate right now.

What do you think?  How much attention are you going to pay to your WTN?

Wednesday, June 15, 2022

Week two of WTN for USTA players

It has been a week since the USTA began publishing WTN ratings for players, and right on schedule it appears the weekly update is being pushed out.  Kudos to the USTA and ITF for sticking to the schedule as advertised.

I've only been able to do some anecdotal analysis of the new ratings so far, more details to hopefully come soon, but I've noticed a few things.

  • WTN ratings can change without a player playing additional matches
  • Correspondingly Game zONe and confidence levels can also change without a player playing additional matches
  • For these players, changes I've seen are generally in the single-digit hundredths, but there are some with larger changes, the highest I've seen is 0.7.

It is entirely possible that while a player did not play more matches, something was reported late to the USTA or ITF and so the algorithm sees it as an additional match and that is why a rating changed.  However, I suspect instead there is some factor in the algorithm that gives more weight to recent results, or has results in the past fall outside the window being considered, or there is an iterative/retroactive aspect to the algorithm and while a player didn't play, someone they played in the past did play and that affected their rating.

If you want to see all the analysis I've done on WTN recently, click here and scroll through the list of posts.

Tuesday, June 14, 2022

Mapping NTRP to WTN - Men's Singles

The USTA has started publishing the new World Tennis Number (WTN) for players on their profile on usta.com.  I did some initial analysis last week, but now I'm able to do a bit more thorough and complete analysis so this begins a sequence of posts with that.

My process for this analysis is to look at a sampling of the USTA League players who received a 2021 year-end rating/level and look at the distribution of WTN levels for each different gender, discipline, and level.

By limiting it to players that received a 2021 year-end level I am only considering players who got a recent NTRP rating so that I'm not using stale NTRP levels in the comparison.  Since NTRP levels don't change again until the end of 2022, a player could have gotten better or worse in the 6 months since ratings were published, but this seems like a reasonably consistent approach to doing this mapping.  Ideally this would be done using WTN ratings at the time the NTRP ratings were published, but we haven't had WTN until now so we do the best we can.
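For those curious about the mechanics, the counting behind charts like these can be sketched in a few lines.  The sample rows are made up, and bucketing a decimal WTN into a whole-number level by dropping the decimal is an assumption for this sketch, not a statement of how WTN defines a level.

```python
from collections import Counter, defaultdict

# Made-up sample rows: (NTRP level, WTN, has the high confidence check mark)
players = [
    ("4.0", 16.8, True), ("4.0", 17.2, False), ("4.0", 14.1, True),
    ("4.5", 13.6, True), ("4.5", 12.9, False), ("3.5", 20.4, True),
]

def wtn_level(wtn):
    """Bucket a decimal WTN into a whole-number level (16.8 -> 16); assumed rule."""
    return int(wtn)

def counts_by_level(rows, high_confidence_only=False):
    """Count players at each WTN level, broken out by NTRP level."""
    counts = defaultdict(Counter)
    for ntrp, wtn, confident in rows:
        if high_confidence_only and not confident:
            continue
        counts[ntrp][wtn_level(wtn)] += 1
    return counts

all_counts = counts_by_level(players)
confident_counts = counts_by_level(players, high_confidence_only=True)
print(dict(all_counts["4.0"]))
print(dict(confident_counts["4.0"]))
```

The same function with the high confidence filter switched on gives the counts for the second style of chart.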

In this post, I'm looking at WTN's for men's singles.  What you will see in the chart below is the count of players at each WTN level in total and broken out by NTRP level.  So let's have it!

(You likely want to click on the image to see a larger version of it).

Looking at the general distribution, we see a typical normal distribution to the left of WTN 17, but there isn't a true peak in the middle, and to the right of 20 there are some strange drops.  It is certainly possible this is the skill level profile of men's singles players, and it could be that the players that would be in the long tail aren't playing competitive matches and so aren't in the WTN system.

We also see as you move towards single digits, there isn't the strange blip that the men's doubles had.

The above is including all players, but if we limit it to those with a high confidence WTN, the chart is as follows:

This looks a bit better, but there is a bit of a strange step from 14 to 16.

What is dramatically different is the number of players drops by over 85%.  It would seem there are very few men that play enough singles to get a high confidence WTN.

But looking at the counts by NTRP level is what is perhaps more interesting and will give us an idea of how NTRP maps to WTN.

As you'd expect there is for the most part the expected normal curve within each level.  What is perhaps not expected is how wide each NTRP level is, although they are probably not as wide as the doubles players.

There are a reasonable number of 4.5s ranging from 8 to 18 with an average of 13.6 and standard deviation of 2.6.  If there was a direct and perfect conversion from NTRP to WTN, one NTRP level would correlate with about 4 WTN levels, so having a range of 11 levels, with some players even beyond that, seems a little high.

Similarly, there are a reasonable number of 4.0s from 11 to 22 with an average of 16.8 and standard deviation of 2.6.  Both of these are significantly better than the doubles and approaching what you might expect.
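As a rough cross-check on these numbers, if the spread within a level really is close to normal, about 95% of players should fall within two standard deviations of the average.  The sketch below uses only the averages, standard deviations, and ranges quoted above; the helper names are mine, and the two standard deviation band is a rule of thumb rather than anything WTN defines.

```python
def inclusive_levels(best, worst):
    """Number of whole WTN levels from best to worst, counting both ends."""
    return worst - best + 1

def two_sigma_band(mean, sd):
    """Band that would hold roughly 95% of players if the spread were normal."""
    return (round(mean - 2 * sd, 1), round(mean + 2 * sd, 1))

# Men's singles figures quoted in this post.
men_45 = {"range": (8, 18), "mean": 13.6, "sd": 2.6}
men_40 = {"range": (11, 22), "mean": 16.8, "sd": 2.6}

for label, level in (("4.5", men_45), ("4.0", men_40)):
    print(label,
          inclusive_levels(*level["range"]),
          two_sigma_band(level["mean"], level["sd"]))
```

The bands line up closely with the ranges I eyeballed from the chart, so the "reasonable number" ranges are roughly what the standard deviations imply, and both are well beyond the 4 or so levels a perfect conversion would give.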

The other thing you can do with this chart is see how many NTRP levels there are at a given WTN level.  For example WTN 18 has a noticeable segment of players from 3.0 to 4.5.  If WTN were to be right, it would be saying the 3.0s and 4.5s with WTN 18 would have a competitive match.  That seems hard to believe, but perhaps there really are some edge cases where a 4.5 has a rating that is lagging their ability and the WTN algorithm, despite being updated weekly, hasn't caught up with the 3.0's improvement.  But this seems a bit of a stretch.

Which is right?  WTN or NTRP?  I can't say at this point, but stay tuned, I'll keep doing analysis.  And it is likely that neither is "right" or "wrong" and they are just different.

See the prior posts for Women's Doubles, Men's Doubles, and Women's Singles.

Monday, June 13, 2022

Mapping NTRP to WTN - Women's Singles

The USTA has started publishing the new World Tennis Number (WTN) for players on their profile on usta.com.  I did some initial analysis last week, but now I'm able to do a bit more thorough and complete analysis so this begins a sequence of posts with that.

My process for this analysis is to look at a sampling of the USTA League players who received a 2021 year-end rating/level and look at the distribution of WTN levels for each different gender, discipline, and level.

By limiting it to players that received a 2021 year-end level I am only considering players who got a recent NTRP rating so that I'm not using stale NTRP levels in the comparison.  Since NTRP levels don't change again until the end of 2022, a player could have gotten better or worse in the 6 months since ratings were published, but this seems like a reasonably consistent approach to doing this mapping.  Ideally this would be done using WTN ratings at the time the NTRP ratings were published, but we haven't had WTN until now so we do the best we can.

In this post, I'm looking at WTN's for women's singles.  What you will see in the chart below is the count of players at each WTN level in total and broken out by NTRP level.  So let's have it!

(You likely want to click on the image to see a larger version of it).

Looking at the general distribution, we see a typical normal distribution to the left of WTN 22, but to the right, things aren't quite as prototypical.  There is a spike at 23 and then several big drops to 24, then to 26 and also 28.  It is also missing the long tail we expect from a normal distribution.  It is certainly possible this is the skill level profile of women's singles players, and it could be that the players that would be in the long tail aren't playing competitive matches and so aren't in the WTN system.

We also see as you move towards single digits, the women fall off quickly.  Below WTN 15, there are very few players.  Even though the peak in singles is 23 while the doubles peak was 25, the long tail to the left was more significant for doubles than singles.

The above is including all players, but if we limit it to those with a high confidence WTN, the chart is as follows:

This isn't dramatically different looking, but the peak is at 20 instead of 23.  What is dramatically different is the number of players drops by over 80%.  It would seem there are very few women that play enough singles to get a high confidence WTN.

But looking at the counts by NTRP level is what is perhaps more interesting and will give us an idea of how NTRP maps to WTN.

As you'd expect there is for the most part the expected normal curve within each level, perhaps more so than the doubles analysis.  What is perhaps not expected is how wide each NTRP level is, although they are probably not as wide as the doubles players.

There are a reasonable number of 4.5s ranging from 13 to 22 (10 to 24 for all) with an average of 16.1 and standard deviation of 2.2.  If there was a direct and perfect conversion from NTRP to WTN, one NTRP level would correlate with about 4 WTN levels, so having a range of 10 levels, with some players even beyond that, seems a little high.

Similarly, there are a reasonable number of 4.0s from 15 to 26 (12 to 29 for all) with an average of 18.6 and standard deviation of 1.9.  Both of these are significantly better than the doubles and approaching what you might expect.

The other thing you can do with this chart is see how many NTRP levels there are at a given WTN level.  For example WTN 21 has a noticeable segment of players from 3.0 to 4.5.  If WTN were to be right, it would be saying the 3.0s and 4.5s with WTN 21 would have a competitive match.  That seems hard to believe, but perhaps there really are some edge cases where a 4.5 has a rating that is lagging their ability and the WTN algorithm, despite being updated weekly, hasn't caught up with the 3.0's improvement.  But this seems a bit of a stretch.
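One way to put a number on "noticeable segment" is to pick a minimum share of the players at a WTN level and list the NTRP levels that clear it.  The counts and the 5% cutoff below are invented for illustration and are not my actual data.

```python
# Made-up counts of players at a single WTN level, keyed by NTRP level.
counts_at_wtn_21 = {"3.0": 40, "3.5": 210, "4.0": 180, "4.5": 35, "5.0": 3}

def noticeable_levels(counts, min_share=0.05):
    """NTRP levels that make up at least min_share of the players at a WTN level."""
    total = sum(counts.values())
    return sorted(level for level, n in counts.items() if n / total >= min_share)

print(noticeable_levels(counts_at_wtn_21))
print(noticeable_levels(counts_at_wtn_21, min_share=0.10))
```

With these made-up counts a 5% cutoff picks up four NTRP levels at the one WTN level, and tightening it to 10% leaves just the middle two, so how many levels "overlap" depends a lot on where you draw the line.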

Which is right?  WTN or NTRP?  I can't say at this point, but stay tuned, I'll keep doing analysis.  And it is likely that neither is "right" or "wrong" and they are just different.

See the prior post for Women's Doubles and Men's Doubles, and stay tuned for the Men's Singles.

An Update to Mapping NTRP to WTN - Men's Doubles

I wrote yesterday about some observations from looking at WTN levels by NTRP level for men's doubles players, a key one being that the range of WTN levels for each NTRP level was quite large.  After giving it some thought, I wanted to make a slight adjustment to what I did to perhaps be a little more fair to WTN.

My approach to the analysis was to look at players that had a 2021 year-end 'C' NTRP rating/level and look at the WTN's for these players.  The idea was that if the USTA was able to give the player a 'C' rating, it should be good enough for WTN too.

Upon looking deeper, I found that a fair number of players I looked at did not have the blue check mark indicating a high confidence in their WTN rating, so it only seemed fair to take another pass at the analysis to only include those with a high confidence.

On to the new chart then!

For comparison, here is the previous chart.

The new chart does look a bit better, the unexpected non-"normal" right half of the chart now looks better.  So it would seem it was low-confidence players that were not rated accurately and causing the strange looking chart.

However, we see the counts at each WTN are down nearly 50%, so for about half of the players that NTRP found had enough matches to get a 'C' rating, WTN does not give a high confidence rating.  Now, this is in part because WTN calculates separate singles and doubles ratings so that shouldn't be ignored.

The blip at WTN 3 is still there, but it is smaller and there are fewer 4.5s in it which is an improvement.

While the above few things are improved, the ranges of WTN levels for an NTRP level are still very large.  At 4.0 for example, we still see significant numbers of players from 13 to 27 which is 15 levels, still more than I would have expected.

Standard deviations came down a bit but are still pretty large:

  • 3.0 went from 3.4 to 3.2
  • 3.5 went from 3.7 to 3.4
  • 4.0 went from 4.1 to 3.5
  • 4.5 went from 4.9 to 3.9

A modest to decent improvement, but still larger than I'd expect.
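To put the improvement in relative terms, the before and after standard deviations listed above can be turned into percentage reductions.  This is just arithmetic on the numbers in the list; the variable and function names are mine.

```python
# (all players, high confidence only) standard deviations from the list above
men_doubles_sd = {
    "3.0": (3.4, 3.2),
    "3.5": (3.7, 3.4),
    "4.0": (4.1, 3.5),
    "4.5": (4.9, 3.9),
}

def pct_reduction(before, after):
    """Percentage drop from before to after, rounded to one decimal."""
    return round(100 * (before - after) / before, 1)

reductions = {level: pct_reduction(b, a) for level, (b, a) in men_doubles_sd.items()}
print(reductions)
```

That works out to roughly 6%, 8%, 15%, and 20% for 3.0 through 4.5, so the high confidence filter tightens the higher levels the most.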

So I thought it was only fair to use high confidence WTNs, but the observations don't change a whole lot.

What do you think?

An Update to Mapping NTRP to WTN - Women's Doubles

I wrote yesterday about some observations from looking at WTN levels by NTRP level for women's doubles players, a key one being that the range of WTN levels for each NTRP level was quite large.  After giving it some thought, I wanted to make a slight adjustment to what I did to perhaps be a little more fair to WTN.

My approach to the analysis was to look at players that had a 2021 year-end 'C' NTRP rating/level and look at the WTN's for these players.  The idea was that if the USTA was able to give the player a 'C' rating, it should be good enough for WTN too.

Upon looking deeper, I found that a fair number of players I looked at did not have the blue check mark indicating a high confidence in their WTN rating, so it only seemed fair to take another pass at the analysis to only include those with a high confidence.

On to the new chart then!


For comparison, here is the previous chart.


The new chart does look better, the unexpected non-"normal" right half of the chart now looks better.  So it would seem it was low-confidence players that were not rated accurately and causing the strange looking chart.

However, we see the counts at each WTN are down over 30%, so for roughly a third of the players that NTRP found had enough matches to get a 'C' rating, WTN does not give a high confidence rating.  Now, this is in part because WTN calculates separate singles and doubles ratings so that shouldn't be ignored.

While the above few things are improved, the ranges of WTN levels for an NTRP level are still very large.  At 4.0 for example, we still see significant numbers of players from 15 to 26 which is 12 levels, still more than I would have expected.

Standard deviations came down a bit but are still pretty large:

  • 3.0 went from 3.1 to 2.9
  • 3.5 went from 3.2 to 2.9
  • 4.0 went from 3.3 to 3.0
  • 4.5 went from 3.8 to 3.3

A small/modest improvement, but not a dramatic change.

So I thought it was only fair to use high confidence WTNs, but the observations don't change a whole lot.

What do you think?