Wednesday, June 29, 2022

Should WTN play a role in USTA NTRP leagues?

The fourth week of the new ITF-run WTN ratings has been published, and I will try to do some more analysis soon.  But a related topic came to mind that I wanted to write about as well.

For USTA League players, the carrot that is USTA League Nationals, and the chance to go through the process to win your local league, Districts, Sectionals, and advance on to Nationals, is something that becomes a goal for a fair number of players and captains.  This can promote friendly and competitive play, but taken to extremes, can result in what is considered by some to be going too far and undesirable behavior.

The NTRP level based system the USTA uses implicitly gives an advantage to teams that have players with abilities in the upper part of their level.  A 3.5 team with players in the 3.40's or higher is going to beat a team with "average" 3.5s in the 3.20s and 3.30s more often than not.  And generally speaking, to advance in playoffs you need to have players playing above level and on their way to being bumped up, so a Sectionals/Nationals 3.5 team will likely have at least some players in the 3.60s, 3.70s, or higher.  And to win Nationals, you likely need a line-up that is all on their way to being bumped up.

Some of this can happen naturally, e.g. someone who was a legitimate 3.5 last year and got a 3.5C decides to get better and puts in the work with lessons, drills, and increased play, and they naturally get better and by year-end their dynamic rating is around 3.70 and they get bumped up at year-end.  If a captain can recruit and motivate a group of players with this potential for improvement, they can be a Nationals caliber team.

But strong teams can form unnaturally too, e.g. someone that is a 4.0C from 2 years ago took it "easy" or perhaps deliberately lost last year to get bumped down to 3.5C and joins with a few others that did the same to have a 3.5 super team that can be Nationals caliber.  While natural bump downs can happen, when they are manipulated like I noted, most agree this violates the spirit of the rules for sure, and perhaps even some specific rules.

Another way super teams form at times is with brand new players that are able to self-rate at the desired level.  The USTA has self-rate guidelines that dictate the minimum level for a player based on their playing history, but they are far from perfect and can't factor in everything that might identify how good of a player someone is, and there are routinely situations where someone with "4.0" ability is able to self-rate as a 3.5 and if several of these get together, or are sprinkled in with players described above, they can also form a Nationals caliber team.

Self-rating "too low" as allowed by the guidelines is not a violation of any written rule, but some would argue that a captain/player probably knows they are out of level, or will figure that out after a few matches, and should opt to only play up going forward once they figure it out.  But not everyone feels that way and there are pressures from teammates to not bail on them, this is their chance at Nationals after all.

Now, the USTA will say the three-strike DQ process is in place to catch these players, but in my opinion, the thresholds are too high and the minimum matches too low, and that allows players to hide and make it to Sectionals and Nationals when they really are clearly out of level.

With the advent of WTN, I think there is an opportunity for the USTA to do something to address the self-rate issue.  Many players that self-rate will have played in high school or juniors, and if WTN does what it is purported to, these players will have a WTN rating.  I would posit that an actual rating based on match results should trump, or at least play a factor in the self-rate guidelines such that a player with a WTN of N or better can self-rate no lower than 4.0 for example.
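To make the idea concrete, here is a minimal sketch in Python of how a WTN-based floor could sit on top of the existing guidelines.  Everything in it is hypothetical: the function name, the cutoff standing in for "N", and the 4.0 floor are placeholders for illustration, not anything the USTA has published.

```python
from typing import Optional

def minimum_self_rate(guideline_level: float,
                      wtn: Optional[float],
                      wtn_cutoff: float,
                      floor_level: float = 4.0) -> float:
    """Lowest NTRP level a new player would be allowed to self-rate at.

    guideline_level: minimum level from the existing self-rate guidelines.
    wtn: the player's WTN, or None if they don't have one.
    wtn_cutoff: the hypothetical "N" (lower WTN numbers are better).
    floor_level: hypothetical NTRP floor applied at or below the cutoff.
    """
    if wtn is not None and wtn <= wtn_cutoff:
        # A match-based rating at or better than the cutoff raises the floor.
        return max(guideline_level, floor_level)
    # No WTN, or a WTN worse than the cutoff: the guidelines stand as-is.
    return guideline_level

# The cutoff of 15.0 is made up purely for the example.
print(minimum_self_rate(3.5, 12.0, wtn_cutoff=15.0))  # 4.0
print(minimum_self_rate(3.5, None, wtn_cutoff=15.0))  # 3.5
```

Note the WTN only ever raises the minimum here; a player the guidelines already put at 4.5 stays at 4.5 regardless of their WTN.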

I would think this would help tighten up the self-rate process and avoid some of the blatant under self-rating that goes on, and get players at the right level sooner.

Now, for this to work, WTN needs to work and be equitable across all players and there are some early indications there may be some challenges there (women's singles, women's doubles, men's singles, men's doubles), but hopefully that will improve, and even as-is I think using WTN in the self-rating process could help.

Note, UTR has been around and arguably could have been used in self-rating for years now, but the USTA was never going to validate UTR by doing so.  They have WTN now, so make it happen!

What do you think?  Should a player's WTN play a factor in what they can self-rate at?

Thursday, June 16, 2022

What does WTN do with professionals, and what is the Pro Zone?

The World Tennis Number (WTN) from the ITF is now being published on USTA player profiles, and is advertised as a rating going from 40 for a beginner to 1 for a pro.  But what exactly does that mean?  What are professionals' WTNs?  And how can you relate a pro's WTN to your own?

First, from the various FAQs on WTN, there is the text "Pro players will be closer to 1".  This isn't explicit, but with the FAQ preceding this statement with the range being 40 to 1, one might think that the very top pro is a 1.0 and other pros would be in the 1.x or perhaps even 2.x range.

Second, in my perusal of WTNs, the smallest I've seen is 1.1, this for a male unrated (NTRP) player from Mississippi.  There are actually very few in the 1's, but there are a fair number in the 2's.  So if a pro is a 1, is this 1.1 on par with pros?

When I did my chart showing a hypothetical mapping from NTRP to WTN, I left a little room for pros (NTRP 7.0) to live in the 1-3 WTN range.  My guess at the mapping has been proven wrong with the ranges much larger with huge overlap, but if there is no room for the pros, what is done for them?

It turns out the USTA, at least, is not publishing WTNs for pros.  If you go look up Jack Sock for example, you won't see a WTN showing for singles or doubles, but instead it has the meter maxed out and "Pro Zone" text in the middle of the widget.

It appears what happens is anyone that is an actual "pro", this based on having an ATP/WTA ranking, will simply be listed as a "Pro" and that appears to be what "Pro Zone" means.  This from an LTA web-site:

A world-wide rating system that ranges from 40 (recreational players) to 1 (pro players) - players with an ATP/WTA ranking will be listed as PRO

What I'm not sure of is if the matches professionals play are part of the WTN dataset or not.  You'd think they would be, in which case my guess is the WTNs could go below 1.0 and perhaps even negative.  Perhaps this is why they don't show them and instead just show "Pro Zone".

But what will happen when Jack quits playing and his ATP ranking goes away?  Shouldn't his WTN then become visible?  What will it be?

This exposes a bit of a challenge for any rating system where there are arbitrary upper and lower thresholds.  What happens if a 39.9 does so poorly that their calculated WTN goes above 40?  What happens when a player, recreational player or pro, wins so much their WTN goes below 1?  Does the whole range have to be recalibrated and everyone's WTN changes to keep the integrity of the 40 and 1 boundaries?

It appears the ITF (or USTA at least) is punting on the subject and just not showing pros' WTNs.  And beginners, at this point at least, aren't playing any matches that get into the system so we don't have to worry about that yet.

Although on this last point, I do see some NTRP 2.5s with WTNs of 40.0, not many, but a few.  So it appears this boundary is enforced, at least from what is published.
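One simple way a system could enforce the boundaries is to clamp whatever the algorithm calculates before publishing it.  I have no idea if this is what the ITF actually does; the sketch below is just an assumption to show the effect, and the names in it are made up.

```python
WTN_BEST = 1.0    # advertised pro end of the scale
WTN_WORST = 40.0  # advertised beginner end of the scale

def published_wtn(calculated: float) -> float:
    """Clamp a calculated rating into the advertised 1 to 40 range.

    Assumption: the underlying calculation is allowed to go past the
    boundaries and only the published number is limited.
    """
    return min(WTN_WORST, max(WTN_BEST, calculated))

print(published_wtn(41.3))  # 40.0
print(published_wtn(0.4))   # 1.0
print(published_wtn(22.3))  # 22.3
```

With clamping, nobody else's number has to move when one player goes past a boundary, but every player beyond the boundary looks identical, which is one reason you might see a cluster of players sitting at exactly 40.0.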

And in another "Whacky WTN" moment, I found a male player who is a self-rated NTRP 5.5, and in singles they show as "Pro Zone", but in doubles they are a measly 22.3 which is the WTN you might see for an NTRP 3.5 or even 3.0 player!  Similarly I found a female player with no NTRP that is "Pro Zone" in singles but a WTN 17.4 in doubles.  And there are others that are Pro Zone in singles or doubles and a double digit WTN in the other.  And some of these wild differences have the high confidence blue checkmark as well.

I'm sure some of what I'm seeing are exceptions or edge cases, but I'm coming across a fair amount.  It is interesting to say the least.

Observations about WTN for USTA League players after two weeks of publishing

The USTA has now published the new World Tennis Number (WTN) for two weeks, so there is enough data, real and anecdotal, to begin making some observations.

As stated by the ITF/USTA, a player's WTN is supposed to be updated once a week on Wednesday and incorporate matches through the previous Sunday.  So far we've had the ratings published on Wednesdays so they are hitting that, but are they incorporating the matches they say they are?

That leads to the first observation which is it appears there may be a delay in matches being included.  I've seen several cases where players played matches between 6/5 (the previous Sunday) and 6/12 (the Sunday before the 6/15 publish) and all indications are those matches were not included in the calculations.  And yes, I checked, the matches were entered on time and prior to 6/12.
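The check I'm describing is just date arithmetic, and can be sketched as below.  The function names are mine, and the only rule assumed is the stated one: a Wednesday publish should include matches through the previous Sunday.

```python
from datetime import date, timedelta

def match_cutoff(publish_date: date) -> date:
    """Most recent Sunday strictly before the publish date."""
    # date.weekday(): Monday is 0 ... Sunday is 6
    days_back = (publish_date.weekday() + 1) % 7 or 7
    return publish_date - timedelta(days=days_back)

def should_be_included(match_date: date, publish_date: date) -> bool:
    """True if a match falls on or before the cutoff for this publish."""
    return match_date <= match_cutoff(publish_date)

publish = date(2022, 6, 15)                             # a Wednesday
print(match_cutoff(publish))                            # 2022-06-12
print(should_be_included(date(2022, 6, 9), publish))    # True
print(should_be_included(date(2022, 6, 13), publish))   # False
```

So a match played on 6/9 and entered on time should have been reflected in the 6/15 publish if the schedule is being followed as stated.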

This may just be growing pains and it will get ironed out, but if a player's matches are not consistently included, it makes it difficult to follow one's WTN and derive any meaning or benefit from tracking it.

But on the other side, there are players that played no new matches, and their WTN changed from week one to week two.  This isn't right or wrong, but does perhaps give us some insight into the algorithm.  It is possible that the matches included actually changed due to catching up, despite none being played, and this would be another indication of some growing pains in the data collection, but it is perhaps more likely that the WTN algorithm simply allows for changes to a player's rating when they don't play due to prior opponents playing and their rating going up or down.

It is a classic debate with ratings algorithms over whether one should only look at the rating of the opponent at the time of the match, or if their future results, and them getting better or worse, should have an effect on your rating.  As I understand it, the dynamic NTRP rating calculated throughout the year only looks at the rating at the time of the match and your dynamic rating won't change if you don't play.  UTR on the other hand will have a player's rating change when they don't play (wildly so at times) and it appears WTN is following UTR's lead in this area.  I have not seen wild swings with WTN, but we only have one update's worth of data.

A big observation I perhaps should have led with is that in mapping NTRP to WTN, you end up with some extremely large ranges.  I showed this in my analysis of singles (men and women) and doubles (men and women) mappings, but the summary is that doubles has a wider range than singles, but even singles has one NTRP level mapping to 10+ WTN levels (doubles has 15+).  A side-effect of this is that some WTN levels have noticeable representation from 3-4 NTRP levels.  If WTN is to be believed, these players from the 3-4 NTRP levels would have a competitive match together.

Another observation is that some players have dramatically different singles and doubles WTNs.  Differences of 5 are not uncommon at all and there are some that are 10 or more different.  And from personal experience, some of the differences are way off and perhaps even backwards.

Note that WTN is arguably better than NTRP in that it calculates separate ratings for singles and doubles, and the majority of players are probably better at one than the other.  The question is if what the WTN is calculating reflects reality or not.  Perhaps it does overall but there are at least some exceptions that make you go hmmm.  But the separate ratings can also be an explanation for some, but not all, of these observations.

WTN has been positioned as a gender neutral rating, but from what we are seeing, in part due to the really wide ranges noted above, there are women that are two or three NTRP levels lower than some men, that have better WTN ratings than the men.  For example, a 3.5 woman with a WTN that is 2 levels better than a 4.5 man.  It seems improbable that this is accurate, but NTRP has its edge case issues too so I wouldn't throw WTN out because of this, but something seems off.

Note that WTN has said it uses data from four years back, and perhaps even started using data six or more years back, so I don't think it being new and lack of data is an explanation.  I would have thought the observations I'm making would have been made during development and addressed if possible.  Perhaps WTN doesn't consider any of these to be an issue.

Given all these observations, I'm not sure a USTA League player will find WTN terribly valuable.  It is clearly different than NTRP, sometimes significantly higher or lower than expected, and your WTN going up or down or being at a certain level may give no indication of whether you will be bumped up or not which is what a lot of players care about.  Since the USTA says NTRP isn't going away, WTN may be just a curiosity for players to follow for fun.

And since it appears there are some clear inconsistencies in the mapping and ranges, the promise of the WTN Game zONe being a good indicator of a competitive match may or may not be accurate right now.

What do you think?  How much attention are you going to pay to your WTN?

Wednesday, June 15, 2022

Week two of WTN for USTA players

It has been a week since the USTA began publishing WTN ratings for players, and right on schedule it appears the weekly update is being pushed out.  Kudos to the USTA and ITF for sticking to the schedule as advertised.

I've only been able to do some anecdotal analysis of the new ratings so far, more details to hopefully come soon, but I've noticed a few things.

  • WTN ratings can change without a player playing additional matches
  • Correspondingly Game zONe and confidence levels can also change without a player playing additional matches
  • For these players, changes I've seen are generally in the single-digit hundredths, but there are some with larger changes, the highest I've seen is 0.7.

It is entirely possible that while a player did not play more matches, something was reported late to the USTA or ITF and so the algorithm sees it as an additional match and that is why a rating changed.  However, I suspect instead there is some factor in the algorithm that gives more weight to recent results, or has results in the past fall outside the window being considered, or there is an iterative/retroactive aspect to the algorithm and while a player didn't play, someone they played in the past did play and that affected their rating.

If you want to see all the analysis I've done on WTN recently, click here and scroll through the list of posts.

Tuesday, June 14, 2022

Mapping NTRP to WTN - Men's Singles

The USTA has started publishing the new World Tennis Number (WTN) for players on their profile on usta.com.  I did some initial analysis last week, but now I'm able to do a bit more thorough and complete analysis so this begins a sequence of posts with that.

My process for this analysis is to look at a sampling of the USTA League players who received a 2021 year-end rating/level and look at the distribution of WTN levels for each different gender, discipline, and level.

By limiting it to players that received a 2021 year-end level I am only considering players who got a recent NTRP rating so that I'm not using stale NTRP levels in the comparison.  Since NTRP levels don't change again until the end of 2022, a player could have gotten better or worse in the 6 months since ratings were published, but this seems like a reasonably consistent approach to doing this mapping.  Ideally this would be done using WTN ratings at the time the NTRP ratings were published, but we haven't had WTN until now so we do the best we can.

In this post, I'm looking at WTN's for men's singles.  What you will see in the chart below is the count of players at each WTN level in total and broken out by NTRP level.  So let's have it!

(You likely want to click on the image to see a larger version of it).

Looking at the general distribution, we see a typical normal distribution to the left of WTN 17, but there isn't a true peak in the middle, and to the right of 20 there are some strange drops.  It is certainly possible this is the skill level profile of men's singles players, and it could be that the players that would be in the long tail aren't playing competitive matches and so aren't in the WTN system.

We also see as you move towards single digits, there isn't the strange blip that the men's doubles had.

The above is including all players, but if we limit it to those with a high confidence WTN, the chart is as follows:

This looks a bit better, but there is a bit of a strange step from 14 to 16.

What is dramatically different is the number of players drops over 85%.  It would seem there are very few men that play enough singles to get a high confidence WTN.

But looking at the counts by NTRP level is what is perhaps more interesting and will give us an idea of how NTRP maps to WTN.

As you'd expect there is for the most part the expected normal curve within each level.  What is perhaps not expected is how wide each NTRP level is, although they are probably not as wide as the doubles players.

There are a reasonable number of 4.5s ranging from 8 to 18 with an average of 13.6 and standard deviation of 2.6.  If there was a direct and perfect conversion from NTRP to WTN, one NTRP level would correlate with about 4 WTN levels, so having a range of 11 levels, with some players even beyond that, seems a little high.

Similarly, there are a reasonable number of 4.0s from 11 to 22 with an average of 16.8 and standard deviation of 2.6.  Both of these are significantly better than the doubles and approaching what you might expect.
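For anyone wanting to reproduce these summary numbers from their own sample, the calculation can be sketched as below.  The counts in the example are made up so the arithmetic is easy to follow (they are not my data), and I'm using the population standard deviation here, which is an assumption about which flavor to use.

```python
from math import sqrt
from typing import Dict, Tuple

def summarize(counts: Dict[int, int]) -> Tuple[float, float, int]:
    """Mean, standard deviation, and inclusive width of a WTN histogram.

    counts maps a whole WTN level to the number of players at that level.
    """
    levels = [level for level, n in counts.items() if n > 0]
    total = sum(counts[level] for level in levels)
    mean = sum(level * counts[level] for level in levels) / total
    variance = sum(counts[level] * (level - mean) ** 2 for level in levels) / total
    # Inclusive count of levels, e.g. 8 through 18 spans 11 levels
    width = max(levels) - min(levels) + 1
    return mean, sqrt(variance), width

# Hypothetical counts for a single NTRP level
sample = {8: 1, 12: 2, 14: 2, 18: 1}
print(summarize(sample))  # (13.0, 3.0, 11)
```

The width is counted inclusively, which is why a level running from 8 to 18 is 11 WTN levels wide rather than 10.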

The other thing you can do with this chart is see how many NTRP levels there are at a given WTN level.  For example WTN 18 has a noticeable segment of players from 3.0 to 4.5.  If WTN were to be right, it would be saying the 3.0s and 4.5s with WTN 18 would have a competitive match.  That seems hard to believe, but perhaps there really are some edge cases where a 4.5 has a rating that is lagging their ability and the WTN algorithm, despite being updated weekly, hasn't caught up with the 3.0's improvement.  But this seems a bit of a stretch.

Which is right?  WTN or NTRP?  I can't say at this point, but stay tuned, I'll keep doing analysis.  And it is likely that neither is "right" or "wrong" and they are just different.

See the prior posts for Women's Doubles, Men's Doubles, and Women's Singles.

Monday, June 13, 2022

Mapping NTRP to WTN - Women's Singles

The USTA has started publishing the new World Tennis Number (WTN) for players on their profile on usta.com.  I did some initial analysis last week, but now I'm able to do a bit more thorough and complete analysis so this begins a sequence of posts with that.

My process for this analysis is to look at a sampling of the USTA League players who received a 2021 year-end rating/level and look at the distribution of WTN levels for each different gender, discipline, and level.

By limiting it to players that received a 2021 year-end level I am only considering players who got a recent NTRP rating so that I'm not using stale NTRP levels in the comparison.  Since NTRP levels don't change again until the end of 2022, a player could have gotten better or worse in the 6 months since ratings were published, but this seems like a reasonably consistent approach to doing this mapping.  Ideally this would be done using WTN ratings at the time the NTRP ratings were published, but we haven't had WTN until now so we do the best we can.

In this post, I'm looking at WTN's for women's singles.  What you will see in the chart below is the count of players at each WTN level in total and broken out by NTRP level.  So let's have it!

(You likely want to click on the image to see a larger version of it).

Looking at the general distribution, we see a typical normal distribution to the left of WTN 22, but to the right, things aren't quite as prototypical.  There is a spike at 23 and then several big drops to 24, then to 26 and also 28.  It is also missing the long tail we expect from a normal distribution.  It is certainly possible this is the skill level profile of women's singles players, and it could be that the players that would be in the long tail aren't playing competitive matches and so aren't in the WTN system.

We also see as you move towards single digits, the women fall off quickly.  Below WTN 15, there are very few players, even though the peak in singles is 23 while the doubles peak was 25, the long tail to the left was more significant for doubles than singles.

The above is including all players, but if we limit it to those with a high confidence WTN, the chart is as follows:

This isn't dramatically different looking, but the peak is at 20 instead of 23.  What is dramatically different is the number of players drops over 80%.  It would seem there are very few women that play enough singles to get a high confidence WTN.

But looking at the counts by NTRP level is what is perhaps more interesting and will give us an idea of how NTRP maps to WTN.

As you'd expect there is for the most part the expected normal curve within each level, perhaps more so than the doubles analysis.  What is perhaps not expected is how wide each NTRP level is, although they are probably not as wide as the doubles players.

There are a reasonable number of 4.5s ranging from 13 to 22 (10 to 24 for all) with an average of 16.1 and standard deviation of 2.2.  If there was a direct and perfect conversion from NTRP to WTN, one NTRP level would correlate with about 4 WTN levels, so having a range of 10 levels, with some players even beyond that, seems a little high.

Similarly, there are a reasonable number of 4.0s from 15 to 26 (12 to 29 for all) with an average of 18.6 and standard deviation of 1.9.  Both of these are significantly better than the doubles and approaching what you might expect.

The other thing you can do with this chart is see how many NTRP levels there are at a given WTN level.  For example WTN 21 has a noticeable segment of players from 3.0 to 4.5.  If WTN were to be right, it would be saying the 3.0s and 4.5s with WTN 21 would have a competitive match.  That seems hard to believe, but perhaps there really are some edge cases where a 4.5 has a rating that is lagging their ability and the WTN algorithm, despite being updated weekly, hasn't caught up with the 3.0's improvement.  But this seems a bit of a stretch.
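Reading this off a chart is a bit subjective, so here is a rough sketch of how "noticeable" could be pinned down: count the NTRP levels that make up at least some minimum share of the players at a WTN level.  The 10% share and the sample players are hypothetical, picked only to illustrate the idea.

```python
from collections import Counter
from typing import Iterable, List, Tuple

def ntrp_levels_at(players: Iterable[Tuple[float, int]],
                   wtn_level: int,
                   min_share: float = 0.10) -> List[float]:
    """NTRP levels with at least min_share of the players at wtn_level.

    players: (ntrp_level, whole_wtn_level) pairs.
    min_share: hypothetical cutoff for what counts as "noticeable".
    """
    counts = Counter(ntrp for ntrp, wtn in players if wtn == wtn_level)
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(ntrp for ntrp, n in counts.items() if n / total >= min_share)

# Made-up sample: 23 players at WTN 21 spread over five NTRP levels
sample = ([(3.0, 21)] * 3 + [(3.5, 21)] * 8 + [(4.0, 21)] * 8
          + [(4.5, 21)] * 3 + [(5.0, 21)] * 1)
print(ntrp_levels_at(sample, 21))  # [3.0, 3.5, 4.0, 4.5]
```

In this made-up sample the lone 5.0 falls under the 10% share and is dropped, while the 3.0s and 4.5s, each about 13% of the players at that WTN, count as noticeable.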

Which is right?  WTN or NTRP?  I can't say at this point, but stay tuned, I'll keep doing analysis.  And it is likely that neither is "right" or "wrong" and they are just different.

See the prior post for Women's Doubles and Men's Doubles, and stay tuned for the Men's Singles.

An Update to Mapping NTRP to WTN - Men's Doubles

I wrote yesterday about some observations from looking at WTN levels by NTRP level for men's doubles players, a key one being that the range of WTN levels for each NTRP level was quite large.  After giving it some thought, I wanted to make a slight adjustment to what I did to perhaps be a little more fair to WTN.

My approach to the analysis was to look at players that had a 2021 year-end 'C' NTRP rating/level and look at the WTN's for these players.  The idea was that if the USTA was able to give the player a 'C' rating, it should be good enough for WTN too.

Upon looking deeper, I found that a fair number of players I looked at did not have the blue check mark indicating a high confidence in their WTN rating, so it only seemed fair to take another pass at the analysis to only include those with a high confidence.

On to the new chart then!

For comparison, here is the previous chart.

The new chart does look a bit better, the unexpected non-"normal" right half of the chart now looks better.  So it would seem it was low-confidence players that were not rated accurately and causing the strange looking chart.

However, we see the counts at each WTN are down nearly 50%, so for about half of the players that NTRP found had enough matches to get a 'C' rating, WTN does not give a high confidence rating.  Now, this is in part because WTN calculates separate singles and doubles ratings so that shouldn't be ignored.

The blip at WTN 3 is still there, but it is smaller and there are fewer 4.5s in it which is an improvement.

While the above few things are improved, the ranges of WTN levels for an NTRP level are still very large.  At 4.0 for example, we still see significant numbers of players from 13 to 27 which is 15 levels, still more than I would have expected.

Standard deviations came down a bit but are still pretty large:

  • 3.0 went from 3.4 to 3.2
  • 3.5 went from 3.7 to 3.4
  • 4.0 went from 4.1 to 3.5
  • 4.5 went from 4.9 to 3.9

A modest to decent improvement, but still larger than I'd expect.
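To put the improvement in relative terms, the drop in each standard deviation can be worked out as a percentage.  This is just arithmetic on the numbers listed above; the variable and function names are mine.

```python
# (before, after) standard deviations for men's doubles, by NTRP level
std_devs = {3.0: (3.4, 3.2), 3.5: (3.7, 3.4), 4.0: (4.1, 3.5), 4.5: (4.9, 3.9)}

def pct_drop(before: float, after: float) -> float:
    """Percent reduction going from before to after."""
    return (before - after) / before * 100

for level, (before, after) in std_devs.items():
    print(f"{level}: {pct_drop(before, after):.1f}% smaller")
# 3.0: 5.9% smaller
# 3.5: 8.1% smaller
# 4.0: 14.6% smaller
# 4.5: 20.4% smaller
```

The higher levels tightened up the most, but even the roughly 20% reduction at 4.5 still leaves a standard deviation near 4 WTN levels.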

So I thought it was only fair to use high confidence WTNs, but the observations don't change a whole lot.

What do you think?

An Update to Mapping NTRP to WTN - Women's Doubles

I wrote yesterday about some observations from looking at WTN levels by NTRP level for women's doubles players, a key one being that the range of WTN levels for each NTRP level was quite large.  After giving it some thought, I wanted to make a slight adjustment to what I did to perhaps be a little more fair to WTN.

My approach to the analysis was to look at players that had a 2021 year-end 'C' NTRP rating/level and look at the WTN's for these players.  The idea was that if the USTA was able to give the player a 'C' rating, it should be good enough for WTN too.

Upon looking deeper, I found that a fair number of players I looked at did not have the blue check mark indicating a high confidence in their WTN rating, so it only seemed fair to take another pass at the analysis to only include those with a high confidence.

On to the new chart then!


For comparison, here is the previous chart.


The new chart does look better, the unexpected non-"normal" right half of the chart now looks better.  So it would seem it was low-confidence players that were not rated accurately and causing the strange looking chart.

However, we see the counts at each WTN are down over 30%, so for roughly a third of the players that NTRP found had enough matches to get a 'C' rating, WTN does not give a high confidence rating.  Now, this is in part because WTN calculates separate singles and doubles ratings so that shouldn't be ignored.

While the above few things are improved, the ranges of WTN levels for an NTRP level are still very large.  At 4.0 for example, we still see significant numbers of players from 15 to 26 which is 12 levels, still more than I would have expected.

Standard deviations came down a bit but are still pretty large:

  • 3.0 went from 3.1 to 2.9
  • 3.5 went from 3.2 to 2.9
  • 4.0 went from 3.3 to 3.0
  • 4.5 went from 3.8 to 3.3

A small/modest improvement, but not a dramatic change.

So I thought it was only fair to use high confidence WTNs, but the observations don't change a whole lot.

What do you think?



Another USTA League Survey - This one is better!

I am interrupting my sequence of posts on WTN to say that I got an e-mail from the USTA today with another survey.  They send these out periodically and normally they are pretty underwhelming.

This survey starts by asking some questions about participation, e.g. how often I play, lessons/drills, what non-USTA programming I play, and what USTA programs played in 2021.

The rest of the questions may be dependent on the answers to those first four, but I was then asked about my reasons for playing USTA League, what I would change, plans for the future, whether I'd recommend USTA League, and opportunity to provide feedback on the digital experience and anything else.

It then ends with some classification questions and that was it.

All in all, a better survey IMHO than some previous ones as the questions were more meaningful and there was an opportunity to write responses.  Thank you USTA!

I have no clue how the survey will be used or what change it can effect, but I'm glad the USTA is seeking feedback and hope they take it in and improve things where they can.

Did you take the survey?  If you said you played tournaments did you get a different set of questions?

Mapping NTRP to WTN - Men's Doubles

The USTA has started publishing the new World Tennis Number (WTN) for players on their profile on usta.com.  I did some initial analysis last week, but now I'm able to do a bit more thorough and complete analysis so this begins a sequence of posts with that.

My process for this analysis is to look at a sampling of the USTA League players who received a 2021 year-end rating/level and look at the distribution of WTN levels for each different gender, discipline, and level.

By limiting it to players that received a 2021 year-end level I am only considering players who got a recent NTRP rating so that I'm not using stale NTRP levels in the comparison.  Since NTRP levels don't change again until the end of 2022, a player could have gotten better or worse in the 6 months since ratings were published, but this seems like a reasonably consistent approach to doing this mapping.  Ideally this would be done using WTN ratings at the time the NTRP ratings were published, but we haven't had WTN until now so we do the best we can.

In this post, I'm looking at WTN's for men's doubles.  What you will see in the chart below is the count of players at each WTN level in total and broken out by NTRP level.  So let's have it!

(You likely want to click on the image to see a larger version of it).

Looking at the general distribution, we see a fairly typical normal distribution to the left of WTN 22, but to the right, things aren't quite as prototypical.  22 itself is more of a peak than expected, with a quick drop to 23 and 24 before a normal gradual slope, then a quick drop again, and then it is missing the long tail we expect from a normal distribution.  It is certainly possible this is the skill level profile of male doubles players, and it could be that the players that would be in the long tail aren't playing competitive matches and so aren't in the WTN system.

There is an odd blip of men at WTN 3, including some 4.5s, more at WTN 3 than at any of WTN 4 thru 8.  That seems a bit odd, both the 4.5s/5.0s and the high (relatively) number.

Comparing this to the women's chart, we see a difference not only in the WTN with the largest number of players, 22 for the men and 25 for the women, but also at the low end: where the women had virtually no one at 10 or better, the men have a fair number, including the blip at 3.

But looking at the counts by NTRP level is what is perhaps more interesting and will give us an idea of how NTRP maps to WTN.

As you'd expect there is for the most part the expected normal curve within each level, that is until you get to 3.0 which has some ups and downs across WTNs that cause the oddity noted above.  What is perhaps not expected is how wide each NTRP level is.

There are a noticeable number of 4.5s ranging from 3 to 27 (2 to 31 for all) with an average of 17.2 and standard deviation of 4.9.  If there was a direct and perfect conversion from NTRP to WTN, one NTRP level would correlate with about 4 WTN levels, so having a range of 25 (!!!!) levels, with some players even beyond that, seems very high.

Similarly, there are a reasonable number of 4.0s from 10 to 30 (3 to 34 for all) with an average of 21.4 and standard deviation of 4.09.

The explanation of course is that NTRP and WTN use different algorithms and there is no direct mapping.  And WTN could argue that their algorithm is better and these large ranges are actually correct and more accurate, but that would be indicting NTRP as not being accurate then.

One key algorithm difference is that WTN calculates separate singles and doubles ratings, and that can certainly be the reason for some variance, e.g. someone could be good at doubles and bad at singles but play both.  Their NTRP will be somewhere in the middle but WTN will have the individual ratings farther apart.

Additionally, what matches the USTA includes for WTN has not, to my knowledge, been shared yet.  If WTN includes matches that NTRP does not, that could be a contributor to differences between the two systems.

Back on the genders though, comparing the average WTN for males and females, we get:

  • 3.0 - women 28.2 vs men 28.4
  • 3.5 - women 24.8 vs men 24.6
  • 4.0 - women 21.9 vs men 21.4
  • 4.5 - women 18.7 vs men 17.2

There is not a significant difference between these at most levels, with the men being a few tenths better at 3.5 and 4.0, and 1.5 better at 4.5, but surprisingly the average 3.0 woman rates better than the average 3.0 man!  Something seems amiss here as there is a general consensus that on the NTRP scale, for a male and female of the same level, the male is about 0.5 NTRP better.  That would translate to about 2 WTN levels and we see the difference at 4.5 approaching that, but at the other levels it does not.  I'm not sure the gender neutrality part of WTN is fully working yet.
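To make the comparison concrete, here is a minimal Python sketch of the arithmetic, using only the averages listed above.  The 2-level expected gap is the rough "0.5 NTRP is about 2 WTN levels" rule of thumb from this post, not anything official, and the function and variable names are just for illustration.

```python
# Average doubles WTN by NTRP level, from the list above (a lower WTN is better).
avg_wtn = {
    "3.0": {"women": 28.2, "men": 28.4},
    "3.5": {"women": 24.8, "men": 24.6},
    "4.0": {"women": 21.9, "men": 21.4},
    "4.5": {"women": 18.7, "men": 17.2},
}

# Assumption: 0.5 NTRP corresponds to roughly 2 WTN levels (rule of thumb only).
EXPECTED_GAP = 2.0


def gender_gaps(averages):
    """Women's average minus men's average per level (positive means men rate better)."""
    return {level: round(v["women"] - v["men"], 1) for level, v in averages.items()}


gaps = gender_gaps(avg_wtn)
for level, gap in gaps.items():
    print(f"{level}: men better by {gap:+.1f} WTN (rule of thumb suggests about {EXPECTED_GAP:+.1f})")
```

A negative gap, as at 3.0, means the average woman at that level has the better (lower) WTN.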

The other thing you can do with this chart is see how many NTRP levels there are at a given WTN level.  For example, from WTN 20 to 27 there are 4 or perhaps even 5 noticeable segments of players, from 2.5/3.0 to 4.5.  If WTN were to be right, it would be saying a 3.0 and a 4.5 who both have a WTN of 25 would have a competitive match.  That seems hard to believe, but perhaps there really are some edge cases where a 4.5 has a rating that is lagging their ability and the WTN algorithm, despite being updated weekly, hasn't caught up with the 3.0's improvement.  But this seems a bit of a stretch.

Which is right?  WTN or NTRP?  I can't say at this point, but stay tuned, I'll keep doing analysis.  And it is likely that neither is "right" or "wrong" and they are just different.

See the prior post on Women's Doubles, and stay tuned for this same post for the Women's Singles, and Men's Singles.

Update: I did an update that only looked at high confidence WTN ratings here.

Sunday, June 12, 2022

Mapping NTRP to WTN - Women's Doubles

The USTA has started publishing the new World Tennis Number (WTN) for players on their profile on usta.com.  I did some initial analysis last week, but now I'm able to do a bit more thorough and complete analysis so this begins a sequence of posts with that.

My process for this analysis is to look at a sampling of the USTA League players who received a 2021 year-end rating/level and look at the distribution of WTN levels for each different gender, discipline, and level.

By limiting it to players that received a 2021 year-end level I am only considering players who got a recent NTRP rating so that I'm not using stale NTRP levels in the comparison.  Since NTRP levels don't change again until the end of 2022, a player could have gotten better or worse in the 6 months since ratings were published, but this seems like a reasonably consistent approach to doing this mapping.  Ideally this would be done using WTN ratings at the time the NTRP ratings were published, but we haven't had WTN until now so we do the best we can.
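For anyone who wants to try something similar, the tabulation behind the charts could be sketched roughly like this in Python.  The player records below are made up purely for illustration, and truncating a WTN to its whole number to pick the chart column is an assumption, not necessarily how the charts in these posts were built.

```python
from collections import Counter, defaultdict

# Made-up sample records for illustration only:
# (2021 year-end NTRP level, doubles WTN)
players = [
    ("4.0", 21.4), ("4.0", 19.8), ("4.0", 24.1),
    ("4.5", 17.2), ("4.5", 21.0), ("3.5", 24.6),
]


def counts_by_level(records):
    """Count players at each whole WTN level, broken out by NTRP level."""
    table = defaultdict(Counter)
    for ntrp, wtn in records:
        # Assumption: 21.4 and 21.0 both land in the WTN 21 column.
        table[int(wtn)][ntrp] += 1
    return table


table = counts_by_level(players)
for wtn_level in sorted(table):
    print(wtn_level, dict(table[wtn_level]))
```

Each row of the output is one WTN level with the number of players from each NTRP level, which is the data a stacked bar chart would need.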

In this post, I'm looking at WTNs for women's doubles.  What you will see in the chart below is the count of players at each WTN level in total and broken out by NTRP level.  So let's have it!

(You likely want to click on the image to see a larger version of it).

Looking at the general distribution, we see a typical normal distribution to the left of WTN 25, but to the right, things aren't quite as prototypical.  There is a significant drop from 25 to 26 and 27, and then it is somewhat as expected through 33, but then it is missing the long tail we expect from a normal distribution.  It is certainly possible this is the skill level profile of women's doubles players, and it could be that the players that would be in the long tail aren't playing competitive matches and so aren't in the WTN system.

We also see as you move towards single digits, the women fall off quickly.  While there are a good number of women with a WTN of 20, less than a quarter of that number are a WTN 15 and at 12 or lower, there are hardly any.  It will be interesting to compare this chart with the men's to see how different it is, which will give us an idea of how WTN handles gender neutrality.

But looking at the counts by NTRP level is what is perhaps more interesting and will give us an idea of how NTRP maps to WTN.

As you'd expect there is for the most part the expected normal curve within each level, that is until you get to 3.0 and 2.5, where it doesn't tail off at the higher WTNs as noted above.  What is perhaps not expected is how wide each NTRP level is.

There are a reasonable number of 4.5s ranging from 12 to 26 (4 to 32 for all) with an average of 18.7 and standard deviation of 3.8.  If there was a direct and perfect conversion from NTRP to WTN, one NTRP level would correlate with about 4 WTN levels, so having a range of 15 levels, with some players even beyond that, seems very high.

Similarly, there are a reasonable number of 4.0s from 15 to 29 (4 to 33 for all) with an average of 21.9 and standard deviation of 3.3.

The explanation of course is that NTRP and WTN use different algorithms and there is no direct mapping.  And WTN could argue that their algorithm is better and these large ranges are actually correct and more accurate, but that would be indicting NTRP as not being accurate then.

One key algorithm difference is that WTN calculates separate singles and doubles ratings, and that can certainly be the reason for some variance, e.g. someone could be good at doubles and bad at singles but play both.  Their NTRP will be somewhere in the middle but WTN will have the individual ratings farther apart.

Additionally, what matches the USTA includes for WTN has not, to my knowledge, been shared yet.  If WTN includes matches that NTRP does not, that could be a contributor to differences between the two systems.

The other thing you can do with this chart is see how many NTRP levels there are at a given WTN level.  For example, WTN 25 has a noticeable segment of players from 2.5 to 4.5.  If WTN were to be right, it would be saying a 2.5 and a 4.5 who both have a WTN of 25 would have a competitive match.  That seems hard to believe, but perhaps there really are some edge cases where a 4.5 has a rating that is lagging their ability and the WTN algorithm, despite being updated weekly, hasn't caught up with the 2.5's improvement.  But this seems a bit of a stretch.

Which is right?  WTN or NTRP?  I can't say at this point, but stay tuned, I'll keep doing analysis.  And it is likely that neither is "right" or "wrong" and they are just different.

And stay tuned for this same post for the Men's Doubles, Women's Singles, and Men's Singles.

Update: I posted an update that looks at only high confidence WTNs here.

Friday, June 10, 2022

Initial analysis of NTRP to WTN mapping - Much larger ranges than you'd think!

The USTA launched showing players' WTNs on their profiles earlier this week and I made some anecdotal observations yesterday, but have now started to do a more scientific analysis and while it is by no means complete, I wanted to share some initial results from it.

For this analysis I'm looking at male players with a 4.0C or 4.5C year-end rating for 2021.  The theory is that since NTRP is only published once a year, this is the most recent rating/level and should be the most relevant to compare with.  Yes, I'm doing the same for 4.0 and 4.5 females, it will come shortly.

With this population of players, I can then look at the range of WTNs for each level to see how things map in practice.  I've included the average and standard deviation here too:

  • 4.0 singles - range: 32.2 to 4.4, average: 17.6, standard deviation: 3.2
  • 4.0 doubles - range: 34.0 to 3.3, average: 21.4, standard deviation: 4.1
  • 4.5 singles - range: 32.3 to 5.1, average: 14.4, standard deviation: 3.4
  • 4.5 doubles - range: 31.1 to 2.4, average: 17.2, standard deviation: 4.9

The first thing that jumps out is some pretty astonishing ranges.  If a single NTRP level can span up to 30 WTN levels, it makes interpreting a WTN very difficult to say the least.  It could of course be that WTN is far better and the NTRP levels for these players are way off as well, but ranges this large make you go hmmm and require further analysis.

On the other hand, the averages do kind of make sense.  It is interesting that within a level, the singles WTN is significantly better than doubles, in fact on average this has a 4.0 singles player being rated about the same as a 4.5 doubles player.  Given there are separate singles and doubles ratings, I don't know that one can actually compare them as I'm not sure there is any correlation between them.

But the wide ranges could be due to a handful of edge cases and the general population is more consistent.  The standard deviations help here and, assuming a normal distribution, mean that about 68% of 4.0 singles ratings are going to be between 20.8 and 14.4, and that is a lot more palatable, but still 68% leaves a significant number outside of that range.  And that range is wider than the Game zONe (generally about +/-2) meaning even by the USTA's definition, some portion of this 68% won't be evenly matched, let alone those outside of it.  And the doubles standard deviations are significantly higher.
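The 68% figure is just the average plus or minus one standard deviation.  Here is a small Python sketch of that calculation for 4.0 singles, assuming (as above) that the ratings are normally distributed, which the real data may not be.  The Game zONe half-width of 2 is the approximate figure from this post, and the comparison against a window of +/-2 around the average is only an illustration, since a real Game zONe is centered on an individual player's WTN.

```python
from statistics import NormalDist

GAME_ZONE_HALF_WIDTH = 2.0  # "generally about +/-2" per the post; approximate


def one_sigma_range(mean, std_dev):
    """Return the WTN bounds one standard deviation either side of the mean."""
    return round(mean - std_dev, 1), round(mean + std_dev, 1)


mean, std_dev = 17.6, 3.2  # 4.0 singles, all players
low, high = one_sigma_range(mean, std_dev)

# Assumption: ratings within a level follow a normal distribution.
dist = NormalDist(mu=mean, sigma=std_dev)
share_one_sigma = dist.cdf(high) - dist.cdf(low)
share_game_zone = (dist.cdf(mean + GAME_ZONE_HALF_WIDTH)
                   - dist.cdf(mean - GAME_ZONE_HALF_WIDTH))

print(f"One standard deviation: {low} to {high}")
print(f"Share within one standard deviation: {share_one_sigma:.0%}")
print(f"Share within +/-{GAME_ZONE_HALF_WIDTH} of the average: {share_game_zone:.0%}")
```

Under that assumption, noticeably fewer than 68% of 4.0 singles players fall within 2 WTN levels of the level's average.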

But, the above is looking at all players.  The WTN does have a confidence number that is presumably based on number of matches played and gets higher (up to 100) as the algorithm considers a WTN to be more accurate.  If I do the same stats as above but for WTNs with a confidence of 70 or higher, it is better but still broad ranges.

  • 4.0 singles - range: 27.4 to 10.1, average: 16.8, standard deviation: 2.6
  • 4.0 doubles - range: 31.7 to 3.9, average: 20.4, standard deviation: 3.5
  • 4.5 singles - range: 25.7 to 6.7, average: 13.7, standard deviation: 2.6
  • 4.5 doubles - range: 28.4 to 3.3, average: 15.8, standard deviation: 3.9

It is interesting that the average gets better in every case by around a point as the confidence gets higher.  The standard deviations drop but are still above the +/-2 the Game zONe tends to be.
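As a quick check of those two observations, this Python sketch compares the two sets of numbers listed above.  It uses only the averages and standard deviations from this post, and the helper name is just for illustration.

```python
# (average, standard deviation) for all players and for confidence >= 70,
# copied from the two lists above.
all_players = {
    "4.0 singles": (17.6, 3.2), "4.0 doubles": (21.4, 4.1),
    "4.5 singles": (14.4, 3.4), "4.5 doubles": (17.2, 4.9),
}
high_confidence = {
    "4.0 singles": (16.8, 2.6), "4.0 doubles": (20.4, 3.5),
    "4.5 singles": (13.7, 2.6), "4.5 doubles": (15.8, 3.9),
}


def improvement(before, after):
    """Drop in (average, standard deviation) per group; positive means a lower number."""
    return {
        group: (round(before[group][0] - after[group][0], 1),
                round(before[group][1] - after[group][1], 1))
        for group in before
    }


changes = improvement(all_players, high_confidence)
for group, (avg_drop, sd_drop) in changes.items():
    print(f"{group}: average better by {avg_drop}, standard deviation down by {sd_drop}")

# Every high confidence standard deviation is still above the roughly +/-2 Game zONe.
still_wide = all(sd > 2.0 for _, sd in high_confidence.values())
print("All still wider than +/-2:", still_wide)
```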

More analysis to come including getting this data in some charts, but I wanted to share initial observations right away.

Thursday, June 9, 2022

The USTA sends their e-mail about the WTN launch

About a month ago, the USTA sent out a bevy of e-mails about the forthcoming World Tennis Number (WTN).  Between May 10 and 13, I received no fewer than five e-mails, with several more in the subsequent couple of weeks.  Then things went silent, and in that quiet WTNs showed up on players' profiles yesterday with no e-mail announcement.

I speculated that an e-mail announcement was forthcoming, and sure enough at 7:58 AM today there it was in my inbox.  No real info in the e-mail other than a link to "See Yours Now" which navigates to your profile on usta.com.

I have not had a chance to do a detailed analysis yet, but initial observations and what I'm hearing is that for many people their WTNs are in the ballpark of what might be expected, but there are a handful of surprises where a doubles or singles WTN is remarkably higher or lower than it perhaps should be, and generally speaking, given that WTN is purported to be gender neutral, there are some female players who have a WTN that is perhaps better than you'd think compared with male players.

Here are a few examples, although I will caveat this with the statement that these are some handpicked examples from what I've seen and certainly don't represent what is happening across the board.

Some interesting gender comparisons:

  • 4.5C male with a 21.0 WTNd
  • 4.0C female with a 19.7 WTNd
  • 3.5C female with an 18.6 WTNd
  • 5.0C male with a 16.9 WTNd

This has a 3.5 female less than two points away from a 5.0 male, and both a 3.5 and 4.0 female noticeably better than a 4.5 male.  It is certainly possible that NTRP is lagging (all of these were 2021 year-end levels though) or the fact that WTN has separate ratings for singles and doubles makes it more accurate, but these are interesting.

Then some interesting ranges between singles/doubles:

  • 5.0C male with a 5.8 WTNs and 17.8 WTNd
  • 4.5C male with a 12.6 WTNs and 21.0 WTNd
  • 5.0C male with a 9.5 WTNs and 16.2 WTNd
  • 4.5C male with a 13.6 WTNs and 18.7 WTNd
  • 5.0C male with a 7.5 WTNs and a 12.3 WTNd
  • 4.5C male with a 14.0 WTNs and 18.3 WTNd
  • 4.5C male with a 16.7 WTNs and a 21.2 WTNd
  • 4.0C male with a 15.8 WTNs and a 20.2 WTNd
  • 3.0C male with a 17.0 WTNs and 12.0 WTNd
  • 4.5C male with an 18.2 WTNs and a 4.0 WTNd

I ordered these more or less from wildly better singles WTN to wildly better doubles (remember, a lower WTN is better), and there are certainly more where the singles is better, but it isn't always better.  That last one where a 4.5C is a 4.0 WTN in doubles but only an 18.2 in singles is an astonishing range.
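To put numbers on those gaps, here is a short Python sketch that subtracts the singles WTN from the doubles WTN for each of the handpicked examples above.  The data is copied from the list; nothing here says anything about players beyond these ten.

```python
# (NTRP level, singles WTN, doubles WTN) for the handpicked examples above.
examples = [
    ("5.0C", 5.8, 17.8), ("4.5C", 12.6, 21.0), ("5.0C", 9.5, 16.2),
    ("4.5C", 13.6, 18.7), ("5.0C", 7.5, 12.3), ("4.5C", 14.0, 18.3),
    ("4.5C", 16.7, 21.2), ("4.0C", 15.8, 20.2), ("3.0C", 17.0, 12.0),
    ("4.5C", 18.2, 4.0),
]

# Doubles minus singles: positive means the singles WTN is better (lower).
gaps = [round(doubles - singles, 1) for _, singles, doubles in examples]

better_singles = sum(gap > 0 for gap in gaps)
better_doubles = sum(gap < 0 for gap in gaps)

for (level, singles, doubles), gap in zip(examples, gaps):
    print(f"{level}: singles {singles}, doubles {doubles}, gap {gap:+.1f}")
print(f"Better singles: {better_singles}, better doubles: {better_doubles}")
```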

It is certainly possible that there are ratings above with very limited results in one discipline or another, but the large ranges are interesting.

What are your observations so far?

Wednesday, June 8, 2022

The ITF World Tennis Number has been launched for USTA League players!

I've been writing about the ITF World Tennis Number for nearly three years now.  My first post was in July of 2019 when the ITF announced it, and we've been waiting for it to launch ever since.  The pandemic no doubt threw a wrench into the schedule and launch plans, but finally, nearly three years later, it appears to have launched on players' profiles on usta.com.

For those of you used to going to TennisLink for rating or other USTA League related information, that is not where you will go to see your WTN.  Instead, go to usta.com and log in to your account there, and then select "Profile" from the menu in the upper right, and it should show you your WTN.  This link may take you there if you are already logged in.

What is surprising is that, given all the hype the USTA has been whipping up about WTN with all the recent e-mails and webinars, WTNs started showing up with no announcement.  Perhaps it is just rolling out now and an announcement will show up in the next day or two.

Note, as I've written about, you will have both a singles and doubles WTN unlike NTRP where you have just one rating/level that incorporates all play.  The "singles" and "doubles" on the WTN widget on your profile are clickable to switch between seeing the two ratings.

In my case, I play primarily doubles and have a WTN close to where I'd expect given my analysis on a hypothetical mapping between NTRP and WTN.  My NTRP is a 4.5C and my WTN is 21.0 for doubles.  What is surprising is that my singles WTN is 12.6!  While I like to play singles, at this point, I am almost certainly a better doubles player and clearly my WTN does not reflect this.  This may be an indication of some issues with the WTN algorithm.

What is also interesting is my doubles WTN does not have the Game zONe but my singles does, despite no singles matches in league play recently.

What NTRP are you and what is your singles and doubles WTN?  Do your WTN ratings make sense?  Or does something seem amiss?