Thursday, June 16, 2022

Observations about WTN for USTA League players after two weeks of publishing

The USTA has now published the new World Tennis Number (WTN) for two weeks, so there is enough data, real and anecdotal, to begin making some observations.

As stated by the ITF/USTA, a player's WTN is supposed to be updated once a week on Wednesday and incorporate matches through the previous Sunday.  So far the ratings have been published on Wednesdays, so they are hitting that part, but are they incorporating the matches they say they are?

That leads to the first observation: there appears to be a delay in matches being included.  I've seen several cases where players played matches between 6/5 (the previous Sunday) and 6/12 (the Sunday before the 6/15 publish) and all indications are that those matches were not included in the calculations.  And yes, I checked: the matches were entered on time, prior to 6/12.

This may just be growing pains and it will get ironed out, but if a player's matches are not consistently included, it makes it difficult to follow one's WTN and derive any meaning or benefit from tracking it.
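For anyone who wants to check their own matches against the stated schedule, here is a minimal sketch of the cutoff rule as I understand it (published Wednesday, matches through the previous Sunday).  The function names are my own, and this only models the published schedule, not whatever the WTN system actually does internally.

```python
from datetime import date, timedelta


def cutoff_sunday(publish_date: date) -> date:
    """Return the most recent Sunday strictly before publish_date."""
    # date.weekday() returns Monday=0 ... Sunday=6
    days_back = (publish_date.weekday() + 1) % 7
    if days_back == 0:
        days_back = 7  # publish date is itself a Sunday, so step back a full week
    return publish_date - timedelta(days=days_back)


def expected_in_update(match_date: date, publish_date: date) -> bool:
    """True if, per the stated schedule, the match should count in this update."""
    return match_date <= cutoff_sunday(publish_date)


publish = date(2022, 6, 15)  # the second Wednesday publish
print(cutoff_sunday(publish))                          # 2022-06-12
print(expected_in_update(date(2022, 6, 8), publish))   # True
print(expected_in_update(date(2022, 6, 14), publish))  # False
```

By this rule a match played on 6/8 should have been in the 6/15 update, which is exactly the kind of match that appears to have been missed.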

On the other side, there are players who played no new matches and whose WTN still changed from week one to week two.  This isn't right or wrong, but it does perhaps give us some insight into the algorithm.  It is possible that the set of matches included changed as the data caught up, despite none being played, which would be another indication of growing pains in the data collection.  It is perhaps more likely, though, that the WTN algorithm simply allows a player's rating to change when they don't play, because prior opponents keep playing and their ratings go up or down.

It is a classic debate with ratings algorithms: should one look only at the rating of the opponent at the time of the match, or should the opponent's future results, and their getting better or worse, have an effect on your rating?  As I understand it, the dynamic NTRP rating calculated throughout the year only looks at the rating at the time of the match, and your dynamic rating won't change if you don't play.  UTR, on the other hand, will change a player's rating when they don't play (wildly so at times), and it appears WTN is following UTR's lead in this area.  I have not seen wild swings with WTN, but we only have one update's worth of data.
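To make the two approaches concrete, here is a toy sketch.  It is not the NTRP, UTR, or WTN algorithm, none of which are public in this detail.  The averaging formula, the "adjust" value for each match result, and all the numbers are made up purely for illustration.

```python
def snapshot_rating(matches):
    """Average of each opponent's rating as recorded at match time, plus a result adjustment."""
    return sum(m["opp_rating_then"] + m["adjust"] for m in matches) / len(matches)


def retroactive_rating(matches, current_ratings):
    """Same average, but looking up each opponent's current rating instead."""
    return sum(current_ratings[m["opp"]] + m["adjust"] for m in matches) / len(matches)


# Hypothetical match history for a player who then stops playing.
matches = [
    {"opp": "A", "opp_rating_then": 20.0, "adjust": -1.0},
    {"opp": "B", "opp_rating_then": 22.0, "adjust": 1.0},
]

# Opponent A keeps playing and their rating moves; B's stays put.
current_ratings = {"A": 18.0, "B": 22.0}

print(snapshot_rating(matches))                      # 21.0
print(retroactive_rating(matches, current_ratings))  # 20.0
```

With the snapshot approach the idle player stays at 21.0 no matter what opponent A does later.  With the retroactive approach the same player moves to 20.0 without hitting a ball, which is the behavior described above for players with no new matches.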

A big observation I perhaps should have led with is that in mapping NTRP to WTN, you end up with some extremely large ranges.  I showed this in my analysis of singles (men and women) and doubles (men and women) mappings.  The summary is that doubles has a wider range than singles, but even in singles one NTRP level maps to 10+ WTN levels (in doubles it is 15+).  A side effect of this is that some WTN levels have noticeable representation from 3-4 NTRP levels.  If WTN is to be believed, these players from 3-4 different NTRP levels would have a competitive match together.
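For those who want to look at ranges like these in their own league or area, here is a rough sketch of the two summaries described above.  The sample list is invented for illustration and is not real player data, the function names are mine, and treating the whole-number part of a WTN as its "level" is my assumption.

```python
from collections import defaultdict


def wtn_range_by_ntrp(players):
    """Map each NTRP level to the (min, max) whole-number WTN level seen at it."""
    levels = defaultdict(list)
    for ntrp, wtn in players:
        levels[ntrp].append(int(wtn))  # assumption: "level" = whole-number part
    return {ntrp: (min(v), max(v)) for ntrp, v in sorted(levels.items())}


def ntrp_levels_by_wtn(players):
    """Map each whole-number WTN level to the sorted NTRP levels represented at it."""
    seen = defaultdict(set)
    for ntrp, wtn in players:
        seen[int(wtn)].add(ntrp)
    return {wtn: sorted(ntrps) for wtn, ntrps in sorted(seen.items())}


# Made-up (NTRP, WTN) pairs, for illustration only.
sample = [
    (3.5, 28.4), (3.5, 19.1),
    (4.0, 24.7), (4.0, 19.9),
    (4.5, 19.3), (4.5, 12.0),
]

print(wtn_range_by_ntrp(sample))       # {3.5: (19, 28), 4.0: (19, 24), 4.5: (12, 19)}
print(ntrp_levels_by_wtn(sample)[19])  # [3.5, 4.0, 4.5]
```

The first summary shows how wide each NTRP level's WTN range is, and the second shows which NTRP levels end up sharing a single WTN level.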

Another observation is that some players have dramatically different singles and doubles WTNs.  Differences of 5 are not uncommon at all, and some players' ratings differ by 10 or more.  And from personal experience, some of the differences are way off and perhaps even backwards.

Note that WTN is arguably better than NTRP in that it calculates separate ratings for singles and doubles, and the majority of players are probably better at one than the other.  The question is whether what WTN is calculating reflects reality or not.  Perhaps it does overall, but there are at least some exceptions that make you go hmmm.  The separate ratings can also explain some, but not all, of these observations.

WTN has been positioned as a gender-neutral rating, but from what we are seeing, in part due to the really wide ranges noted above, there are women who are two or three NTRP levels lower than some men yet have better WTN ratings than those men.  For example, a 3.5 woman with a WTN that is 2 levels better than a 4.5 man's.  It seems improbable that this is accurate.  NTRP has its edge-case issues too, so I wouldn't throw WTN out because of this, but something seems off.

Note that WTN has said it uses data from four years back, and perhaps even started using data six or more years back, so I don't think its being new or a lack of data is an explanation.  I would have thought the observations I'm making would have been made during development and addressed if possible.  Perhaps WTN doesn't consider any of these to be an issue.

Given all these observations, I'm not sure a USTA League player will find WTN terribly valuable.  It is clearly different than NTRP, sometimes significantly higher or lower than expected, and your WTN going up or down or being at a certain level may give no indication of whether you will be bumped up or not, which is what a lot of players care about.  Since the USTA says NTRP isn't going away, WTN may be just a curiosity for players to follow for fun.

And since it appears there are some clear inconsistencies in the mapping and ranges, the promise of the WTN Game zONe being a good indicator of a competitive match may or may not be accurate right now.

What do you think?  How much attention are you going to pay to your WTN?

3 comments:

  1. Excellent synopsis. Totally agree with the take home message: In current state, WTN has little value for adult league players.

    Interestingly, since WTN was released I’ve now started registering for more UTR leagues. Probably not the outcome USTA was striving for…

    Replies
    1. Probably not! It will be interesting to see how WTN is used; it seems it will be rolled out slowly rather than big bang.

  2. After nearly six months of seeing how WTN works, it's pretty clear that a player's initial rating number has a significant and persistent effect on the player's rating. The algorithm appears to keep the rating near that initial point, which makes the ratings highly resistant to change, even if the player keeps beating much better ranked opponents. Different starting points and resistance to change likely explain the big disparity in ratings between players of similar skill.
