Saturday, July 29, 2023

A bit more on the recent WTN algorithm change / adjustment

I wrote a few days ago about the recent significant change to WTN ratings that the USTA sent a notice about.  Many players saw their WTN get larger (worse) overnight, some by high single digits and others by 10 or more.

I don't have any detailed analysis yet, but do have a few more observations.

The USTA published an FAQ about the changes that is worth a quick read, and it has a few interesting explanations.

First, some insight into why the change was made:

These enhancements to the ITF WTN algorithm will reposition cohorts of players on the current scale of 40-1 and ensure that players from all over the world are more accurately aligned, particularly with regard to different age groups. While ratings are likely to be adjusted, players will likely see movement relative to their age group.

So, they seem to have concluded that different age groups of players were not rated correctly relative to each other.

Going on, they get more specific:

Adult players aged between 19-29 should expect only minor changes to their ITF World Tennis Number. Larger changes begin to occur for players over 30 years old that regularly play against players of a similar age. Adult players over 30 will see their numbers move down the scale (towards 40), this is to create some balance with Junior players and to ensure full use of the 1-40 scale.

This is consistent with my anecdotal observations that USTA League players around me, for the most part age 30+, have seen their WTN get larger and closer to 40.  I haven't looked at any juniors, but believe their statement to be true.

And from what I see, USTA League players are by and large in the mid-20s and higher.  I see 4.5 and 4.0 men in the mid-20s and the majority of league players are at 3.0-4.0, so that means the majority of league players are being squeezed into the 25-40 range.  That is quite a bit less granularity than we had before (15-40), so we will see how that works out.

Regarding juniors, the FAQ states:

Junior players aged 10 and under will most likely experience a movement down the scale, (towards 40) with players who are aged 17 and 18 moving up the scale (towards 1). The player ratings are now more of an accurate reflection of age, with a general incremental increase expected from aged 10 upwards.

I've also heard from players in the UK and US who have looked at comparisons with players in Europe.  Prior to this change there seemed to be a disconnect, in that what appeared to be similarly skilled players from Europe and the US had significantly different WTN ratings, with those in the US having lower numbers.  This adjustment seems to have addressed that, at least somewhat.

All of this highlights the challenge of a single algorithm that tries to put all tennis players in the world on the same scale.  Whether it is age groups or countries or other geographic grouping, you will inevitably have islands, or cohorts as the FAQ says, that all play each other, but have little to no play with other islands or cohorts.  When this happens, the algorithm has very little data to go on to reasonably ensure players from those different groups are appropriately rated relative to each other.
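
To make the idea of islands concrete, here is a minimal sketch that groups players into islands based purely on who has played whom.  It is only an illustration, not anything the WTN is known to do: the `find_islands` function and the player names are made up, and it only finds groups with no connecting matches at all, while in real data the islands usually have at least a few matches between them.

```python
from collections import defaultdict


def find_islands(matches):
    """Group players into islands, where an island is a set of players
    connected to each other through some chain of matches.

    matches: iterable of (player_a, player_b) pairs (hypothetical input format).
    """
    # Build an undirected "has played" graph.
    opponents = defaultdict(set)
    for player_a, player_b in matches:
        opponents[player_a].add(player_b)
        opponents[player_b].add(player_a)

    seen = set()
    islands = []
    for start in opponents:
        if start in seen:
            continue
        # Walk everything reachable from this player.
        island = set()
        stack = [start]
        while stack:
            player = stack.pop()
            if player in seen:
                continue
            seen.add(player)
            island.add(player)
            stack.extend(opponents[player] - seen)
        islands.append(island)
    return islands


# Made-up example: two groups that never play each other.
example_matches = [("Ann", "Bea"), ("Bea", "Cat"), ("Dev", "Eli")]
example_islands = find_islands(example_matches)
```

With no matches between the two groups, nothing in the data says how Ann, Bea, and Cat compare to Dev and Eli, which is exactly the problem described above.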

This is probably the biggest challenge for the WTN, and as such one would have hoped a lot of effort would have gone into reviewing this and getting it right prior to launch.  If adult USTA League players were rated in the teens and similar recreational players from Europe were in the 20s, that should have been one of the first things reviewed, and it should have stuck out.  Similarly, if all juniors regardless of age were clumped together, or juniors relative to adults seemed off, this could have been observed far earlier and addressed before WTN was launched.

There are ways an algorithm can address this, notably by looking at the matches that are played between the islands of players, and giving them more weight in determining the relative ratings for the group of players they represent.
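
As a rough sketch of that idea, the code below looks only at the matches played between two cohorts and measures how much one cohort beats or misses expectations against the other.  Everything here is an assumption for illustration: the `Match` fields, the logistic `expected_win` curve and its `scale`, and the use of an NTRP-style "higher is better" rating are my own choices, not the WTN or NTRP algorithm.

```python
from dataclasses import dataclass


@dataclass
class Match:
    cohort_a: str    # cohort (island) of the first player
    cohort_b: str    # cohort (island) of the second player
    rating_a: float  # rating of the first player, higher is better
    rating_b: float  # rating of the second player
    a_won: bool      # True if the first player won


def expected_win(rating, opponent_rating, scale=0.5):
    """Assumed logistic win probability; scale is an arbitrary choice."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def cohort_surprise(matches, cohort, other):
    """Average of (actual - expected) for `cohort` in matches against `other`.

    Positive means the cohort wins more than its ratings predict, so its
    players are likely rated too low relative to the other cohort.
    """
    surprises = []
    for m in matches:
        if m.cohort_a == cohort and m.cohort_b == other:
            actual = 1.0 if m.a_won else 0.0
            surprises.append(actual - expected_win(m.rating_a, m.rating_b))
        elif m.cohort_a == other and m.cohort_b == cohort:
            actual = 0.0 if m.a_won else 1.0
            surprises.append(actual - expected_win(m.rating_b, m.rating_a))
    if not surprises:
        return 0.0  # no cross-cohort matches, nothing to go on
    return sum(surprises) / len(surprises)


# Made-up data: equally rated players, but "US" wins 3 of the 4 cross-cohort matches.
example = [
    Match("US", "EU", 3.5, 3.5, True),
    Match("EU", "US", 3.5, 3.5, False),
    Match("US", "EU", 3.5, 3.5, True),
    Match("EU", "US", 3.5, 3.5, True),
    Match("US", "US", 3.5, 3.0, True),  # same cohort, ignored
]
us_surprise = cohort_surprise(example, "US", "EU")
```

A real algorithm would then have to decide how far to shift the whole cohort based on that number, and how much to trust it when there are only a handful of connecting matches.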

This is effectively what the USTA does with NTRP, using the matches from Nationals to "connect" the different sections, and have those results and how well a section did trickle down to players from the section.  If a section does very well at Nationals, that means relatively speaking the players from that section at that level were rated too low relative to other sections, so players' ratings may go up and more bump ups will occur.  It can go the other way too, and I've pointed out the effect of this in many of my year-end rating analyses.

I'm guessing the WTN doesn't inherently do this as part of the algorithm, as it seemingly required this significant adjustment to compensate for what they were observing.  Hopefully they still applied the principles of leveling across cohorts and it wasn't entirely manual.

Another challenge the WTN has, and one that doesn't seem to have been addressed yet and wasn't mentioned in the FAQ, is gender neutrality.  The algorithm claims to be gender neutral, i.e. just like a WTN 25 male should have a competitive match with another WTN 25 male, either should also have a competitive match with a WTN 25 female.

It is generally accepted that NTRP, which is not gender neutral, has about a one level (0.5 NTRP) difference between the genders, e.g. a 3.5 male and 4.0 female will be similarly skilled.  But looking at WTN ratings, one will regularly see men and women of the same NTRP level with similar WTN ratings, or even the women having lower (better) WTN ratings.  Similarly, you will see men with higher (better) NTRP levels having higher (worse) WTN ratings.

This means WTN really can't be used to get cross-gender competitive matches today.  And WTN is really handcuffed when it comes to trying to solve this, as it now has to fit both men and women into the compressed 25-40 range I noted above.

For example, looking at men's WTN ratings, leaving room for the 2.5 and 3.0 women to have higher WTN ratings than 3.0 men, and leaving some room for complete beginners not even playing matches yet, the WTN range all recreational men have to fit into is more like 25-35 at best, and perhaps even less.

The change made to WTN may very well make it better than before.  But the challenges of truly being a world tennis number are still there.  It will be interesting to see how it continues to mature.

Thursday, July 27, 2023

Has the ITF updated the WTN algorithm? Is it any better?

I received an e-mail today from the USTA titled "Now Here: Updates to ITF World Tennis Number" and it goes on to describe the ITF on-boarding several new federations, but also says the ITF has taken the opportunity to "enhance the ITF WTN calculation" and as a result players' WTNs may change.

I've written about the WTN before (see a list of all WTN related posts here) and done some analysis (men's singles, women's singles, men's doubles, women's doubles) that raised some questions about its accuracy.  Since then, I haven't written that often as 1) the ITF/USTA have made it difficult to gather data to do analysis and 2) casual monitoring showed the same initial observations still stood.  This new announcement may cause me to revisit things, although it may be hard to do so broadly.

An initial/personal look though reveals there have been some big changes!

I'd had a WTNs of around 13 forever based on singles play from years ago, and checking today, I see my WTNs is now 26.9!  I'd say that is a change.

In doubles, I've been around a 22 WTNd and today, 29.2!

Whoa, overnight I've gotten a lot worse!

For reference, I'm a middling 4.5 playing primarily (only lately) doubles with a few wins in men's this year but more losses, although several were in match tie-breaks.  Even if NTRP were to bump me down to 4.0 at year-end, the new WTN algorithm seems to be saying that I, as a 4.0/4.5 male, am a 26.9 WTNs.

If this is done across the board and it is consistent, perhaps it is better, relatively at least, but on the surface the mapping between NTRP and WTN now seems even worse than before.  A 4.5 male at 26.9 doesn't leave a lot of room for the 2.5-4.0 players, and there is a lot of room for the 4.5-7.0 players.

Looking at some players around me, I've recently been playing 8.0 mixed with 3.5 women partners and have done better (4-1 this season), and WTNd had my partners similarly rated to me, around 21-24.  That screams that something is wrong with the gender neutral part of WTN, and this algorithm change hasn't fixed it, as my 3.5 women partners are still similarly rated to me, 28-32 or so.

I'm sure there will be naysayers out there who say I must be playing like a 3.5 woman.  Perhaps that is true, but then the folks I've beaten must be at that level or worse.

So there seems to have been a blanket shift up numerically (down in rating strength as WTN has 1 as the best and 40 as the worst) to give more room for high rated players, but no significant change to making gender neutral relativity more accurate.

I'll try to do some more analysis later, but what are you seeing?  What do you think?  Is WTN even something relevant to you in any way at this point?

Thursday, July 6, 2023

Another look at what NTRP rating combinations work best in Mixed 18 & Over? More Interesting Tennis League Stats

I've been looking at how different combinations of levels do in the 55 & Over Adult and 18 & Over Mixed, and just took a deeper dive on 55 & Over, so the same deeper dive on Mixed is in order.

The general observation has been that unbalanced pairs, e.g. a 4.5 and 3.5 in an 8.0 flight, tend to win more often than not, and this deeper dive will take a look at whether this is actually expected or not based on the current ratings of the players.

The deeper dive on 55 & Over revealed that while the unbalanced pairs do win more often, they are always expected to, and often fall short of the expected winning percentage.  This may indicate a selection bias of sorts, where it is possible that only stronger players at their level opt to play as an unbalanced pair, thus skewing the stats in their favor, and there is not some inherent advantage to unbalanced pairs.

So on to taking a closer look at Mixed.  Specifically, I narrowed the matches analyzed to only those where I had an established rating for all 4 players and then looked at not just how often unbalanced pairs won, but how often they were expected to win.  The results were certainly interesting, and a bit different from what we saw for 55 & Over.

Note, the actual winning percentages are slightly different from the earlier post because some matches where the players didn't all have ratings have been filtered out.

For each level/gender below, I'm using 2022 18 & Over Mixed league data again and showing the percentage of the time the unbalanced pair won (actual), followed by the percentage of the time the unbalanced pair had the higher combined ratings (expected).  When the actual is higher than the expected, the unbalanced pair is overachieving, and when the actual is lower than the expected, they are underachieving.

The ratings used for this analysis are my Mixed Exclusive Estimated Dynamic NTRP Ratings I use for doing the various individual, team, and flight reports I offer.  The pairings are showing the male then female levels.
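
To spell out how the two percentages are calculated, here is a minimal sketch.  The `actual_vs_expected` function, the tuple layout, and the sample numbers are all made up for illustration, and treating equal combined ratings as "not favored" is my assumption for the sketch.

```python
def actual_vs_expected(matches):
    """Return (actual %, expected %) for the unbalanced pairs.

    matches: list of (unbalanced_combined, balanced_combined, unbalanced_won)
    where the first two values are the combined ratings of each pair.
    """
    total = len(matches)
    # Actual: how often the unbalanced pair won.
    wins = sum(1 for _, _, unbalanced_won in matches if unbalanced_won)
    # Expected: how often the unbalanced pair had the higher combined rating.
    favored = sum(1 for unbalanced, balanced, _ in matches if unbalanced > balanced)
    return 100.0 * wins / total, 100.0 * favored / total


# Made-up 7.0 matches: (unbalanced combined, balanced combined, unbalanced won?)
sample = [
    (7.10, 7.00, True),
    (6.90, 7.00, True),
    (7.05, 7.00, False),
    (6.95, 7.00, True),
]
actual, expected = actual_vs_expected(sample)
```

In this made-up sample the unbalanced pairs won 3 of 4 matches but had the higher combined rating in only 2 of 4, so they would be listed as overachieving.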

Starting with 6.0:

  • 3.5/2.5 vs 3.0/3.0 - 52.0% actual, 57.1% expected - underachieved
  • 3.5/2.5 vs 2.5/3.5 - 53.7% actual, 41.1% expected - overachieved big
  • 2.5/3.5 vs 3.0/3.0 - 48.0% actual, 59.3% expected - underachieved big

The 6.0 level can be tricky with a lot of new/unknown players at times, but we see the pairs with the higher rated male player win more often, and this is especially true when it is a 3.5 male vs a 2.5 male and the 3.5 male really overachieves vs expected.

But like I said, 6.0 is tricky, so on to 7.0 where players are a little less volatile:

  • 4.0/3.0 vs 3.5/3.5 - 58.7% actual, 56.4% expected - overachieved a bit
  • 4.0/3.0 vs 3.0/4.0 - 56.3% actual, 51.7% expected - overachieved
  • 3.0/4.0 vs 3.5/3.5 - 49.7% actual, 53.3% expected - underachieved

We again see the pair with the higher rated male wins more often, and we also see some overachievement for the 4.0/3.0 pairs.

Next, 8.0:

  • 4.5/3.5 vs 4.0/4.0 - 57.8% actual, 54.9% expected - overachieved
  • 4.5/3.5 vs 3.5/4.5 - 52.4% actual, 48.9% expected - overachieved
  • 3.5/4.5 vs 4.0/4.0 - 54.2% actual, 53.2% expected - overachieved a bit

The trend of unbalanced pairs overachieving continues, by a decent margin of roughly 3 to 3.5 percentage points for the 4.5/3.5 pairs.

Next, 9.0:

  • 5.0/4.0 vs 4.5/4.5 - 61.6% actual, 53.6% expected - overachieved big
  • 5.0/4.0 vs 4.0/5.0 - 59.1% actual, 55.0% expected - overachieved
  • 4.0/5.0 vs 4.5/4.5 - 56.6% actual, 54.6% expected - overachieved a bit

The overachievement for unbalanced pairs is seen here too.

The counts for 10.0 are so low there is no point to doing the analysis here.

Given we see anywhere from some to significant overachievement for the unbalanced pairs, especially at the higher levels, it may be that there is an inherent advantage in Mixed.  This is perhaps because at higher levels, players are able to employ strategies that maximize the difference in abilities and take advantage of having the stronger player.

It could also be that for the lower level player of unbalanced pairs, only the top players at that level self-select to play, and that gives an inherent ratings advantage over two "average" players on a balanced pair.

What do you think?

Wednesday, July 5, 2023

Another look at what NTRP rating combinations work best in Adult 55 & Over? More Interesting Tennis League Stats

A few days ago I took a look at what ratings combinations win the most in 55 & Over Adult league play.  What I found was consistent with earlier analysis I've done and that is that unbalanced pairs, e.g. a 4.0/3.0 pair in 7.0, beat balanced pairs, e.g. a 3.5/3.5, more than they lose.

What some pointed out, and I knew I wanted to do more research into, is that while the unbalanced pairs win more, that may be because of a selection bias of sorts, where it is possible that only stronger players at their level opt to play as an unbalanced pair, and the fact they win more often may be expected.  It is even possible that while they win more than lose, they may not win as often as expected.

For example, at 7.0, it may be that it is the strong 3.0s that are attracted to play or perhaps are recruited to play.  That means when they (say a 2.90 3.0) partner with an average 4.0 (3.75) vs two average 3.5s (3.25), the rating comparison is 6.65 vs 6.50, and the unbalanced pair has the higher combined ratings and thus is arguably expected to win.  Even if the 3.5s are a bit better (3.30) it is still 6.65 vs 6.60.
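
The arithmetic in that example is simple enough to check in a few lines.  This is just the sum of the two partners' ratings using the numbers above; the `combined_rating` helper is a name I made up for the sketch.

```python
def combined_rating(player_1, player_2):
    """Sum of the two partners' ratings, rounded to hundredths."""
    return round(player_1 + player_2, 2)


# Strong 3.0 (2.90) with an average 4.0 (3.75)
unbalanced = combined_rating(2.90, 3.75)
# Two average 3.5s (3.25 each)
balanced_average = combined_rating(3.25, 3.25)
# Two slightly better 3.5s (3.30 each)
balanced_stronger = combined_rating(3.30, 3.30)

# The unbalanced pair is favored in both comparisons.
favored_vs_average = unbalanced > balanced_average
favored_vs_stronger = unbalanced > balanced_stronger
```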

With that in mind, I went about digging deeper.  Specifically, I narrowed the matches analyzed to only those where I had an established rating for all 4 players and then looked at not just how often unbalanced pairs won, but how often they were expected to win.  The results were certainly interesting, if not a bit surprising.

Note, the actual winning percentages are slightly different from the earlier post because some matches where the players didn't all have ratings have been filtered out.

For each level/gender below, I'm using 2022 55 & Over Adult league data again and showing the percentage of the time the unbalanced pair won (actual), followed by the percentage of the time the unbalanced pair had the higher combined ratings (expected).  When the actual is higher than the expected, the unbalanced pair is overachieving, and when the actual is lower than the expected, they are underachieving.

The ratings used for this analysis are my Estimated Dynamic NTRP Ratings I use for doing the various individual, team, and flight reports I offer.

For the women:

  • 6.0 - 3.5/2.5 vs 3.0/3.0 - 48.7% actual, 54.7% expected - underachieved
  • 7.0 - 4.0/3.0 vs 3.5/3.5 - 54.6% actual, 54.5% expected - nearly perfectly as expected
  • 8.0 - 4.5/3.5 vs 4.0/4.0 - 56.2% actual, 62.3% expected - underachieved
  • 9.0 - 5.0/4.0 vs 4.5/4.5 - 66.9% actual, 73.7% expected - underachieved

Here we see the selection of matches change the actual percentages slightly, and at 6.0 the unbalanced pairs actually lost slightly more than they won.  At three of the four levels (6.0, 8.0, and 9.0) the unbalanced pairs were expected to win more often than they did, so they arguably underachieved, even at 8.0 and 9.0 where they won more than they lost.

For the men:

  • 6.0 - 3.5/2.5 vs 3.0/3.0 - 58.6% actual, 58.6% expected - exactly as expected
  • 7.0 - 4.0/3.0 vs 3.5/3.5 - 55.6% actual, 51.1% expected - overachieved
  • 8.0 - 4.5/3.5 vs 4.0/4.0 - 55.4% actual, 56.3% expected - slightly underachieved
  • 9.0 - 5.0/4.0 vs 4.5/4.5 - 50.9% actual, 55.2% expected - underachieved

Here the actual percentages vary a bit more with the selected matches, especially at 9.0.  And we also see that at 6.0 it is exactly as expected and the unbalanced pairs overachieve at 7.0, but 8.0 shows a slight underachievement and 9.0 a significant one.

So, it may not be quite as clear cut as the earlier analysis showed.  While unbalanced pairs do win more than they lose, it appears that is because the right unbalanced pairs are playing and they are expected to win because they are higher rated overall, not just because they are an unbalanced pair.  And they may not even win as often as expected.

What do you think?  Is this more in-line with your experience than the initial analysis was?

Monday, July 3, 2023

What NTRP rating combinations work best in Mixed? More Interesting Tennis League Stats

I wrote yesterday about what ratings combinations work best for 55 & Over Adult play.  Today, I thought I'd start a refresh of the ratings combination analysis for Mixed I did around 8 years ago.

To start, I'll look at 18 & Over Mixed.  For reference, here is the post I did 8 years ago for the same division.

This time I will include 6.0 and 10.0 levels I omitted last time.  For each pairing I list, the first number will be the male level and the second the female level.

Starting with 6.0:

  • 3.5/2.5 pairs beat 3.0/3.0 pairs 52.7% of the time
  • 2.5/3.5 pairs beat 3.0/3.0 pairs 49.6% of the time
  • 3.5/2.5 pairs beat 2.5/3.5 pairs 56.2% of the time

Here, a slight to significant advantage to unbalanced when the male is the higher rated player.  The 2.5/3.5 is ever so slightly below 50%.

Next up, 7.0:

  • 4.0/3.0 pairs beat 3.5/3.5 pairs 58.3% of the time
  • 3.0/4.0 pairs beat 3.5/3.5 pairs 50.8% of the time
  • 4.0/3.0 pairs beat 3.0/4.0 pairs 55.1% of the time

This is similar to before (60%, 52%, 57%) so that form continues to hold.  Unbalanced with the higher rated male has the advantage, but unbalanced with the higher rated female has a slight advantage too.

Next up, 8.0:

  • 4.5/3.5 pairs beat 4.0/4.0 pairs 57.6% of the time
  • 3.5/4.5 pairs beat 4.0/4.0 pairs 53.5% of the time
  • 4.5/3.5 pairs beat 3.5/4.5 pairs 51.7% of the time

This is a bit different from last time (61%, 51%, 60%) with the unbalanced 3.5/4.5 pairs doing better than before.  Still, unbalanced has the advantage over balanced and unbalanced with the higher rated male stronger overall.

And 9.0:

  • 5.0/4.0 pairs beat 4.5/4.5 pairs 59.6% of the time
  • 4.0/5.0 pairs beat 4.5/4.5 pairs 56.1% of the time
  • 5.0/4.0 pairs beat 4.0/5.0 pairs 58.5% of the time

This compares with 57%, 58%, 51% last time, so a bit different, but the trend holds.  Unbalanced pairs win more frequently, and the unbalanced pairs with the higher rated male beat those with the higher rated female.

Last, 10.0:

  • 5.5/4.5 pairs beat 5.0/5.0 pairs 27.0% of the time
  • 4.5/5.5 pairs beat 5.0/5.0 pairs 62.9% of the time
  • 5.5/4.5 pairs beat 4.5/5.5 pairs 66.7% of the time

Note there is so little play at this level that some of the totals for these match-ups are in the single digits so make of that what you will.

Aside from a couple anomalies at the 6.0 and 10.0 levels, the trend continues.  Unbalanced pairs win more often in Mixed.  The advantage is not huge, most of these winning percentages are between 50% and 60%, but it is consistent.

What do you think?  Is this consistent with what you see locally?

Sunday, July 2, 2023

What NTRP rating combinations work best in Adult 55 & Over? More Interesting Tennis League Stats

A popular discussion topic for USTA League teams that use combo-levels is what the best pairings are.  I looked at this a number of years ago for Mixed, but thought it would be interesting to take a look at Adult 55 & Over.

The candidate leagues in this case are the 55 & Over Adult leagues with 6.0, 7.0, 8.0, and 9.0 flights where players up to a full two levels apart can play together.  For example, in 8.0 the playing combinations can be 3.5/4.5, 4.0/4.0, and 4.5/3.5.  The fundamental question is, when different pairings face each other, which wins more often?

Note for this analysis I'm only considering "at-level" pairings and not those where say a 3.5 and 4.0 play together as a "7.5" pair in an 8.0 flight.
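
As a quick illustration of what "at-level" means, here is a sketch that lists the pairings whose levels add up exactly to the flight level.  The `at_level_pairings` function, the list of NTRP levels, and the `max_gap` of 1.0 (a full two levels) are assumptions for this sketch rather than a statement of the USTA rules.

```python
# Assumed set of NTRP levels to consider for the sketch.
NTRP_LEVELS = (2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5)


def at_level_pairings(flight, max_gap=1.0, levels=NTRP_LEVELS):
    """List (first, second) level pairs that sum exactly to the flight level
    and are no more than max_gap apart."""
    pairings = []
    for first in levels:
        for second in levels:
            if first + second == flight and abs(first - second) <= max_gap:
                pairings.append((first, second))
    return pairings


pairings_8_0 = at_level_pairings(8.0)
```

For 8.0 this gives 3.5/4.5, 4.0/4.0, and 4.5/3.5, the same combinations listed above, and a 3.5/4.0 "7.5" pair is left out because it does not add up to 8.0.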

I've done this before for Mixed, both 18 & Over and 40 & Over and in nearly every case, unbalanced pairs win more often than balanced pairs, generally between 52% and 60% depending on the exact scenario.  Mixed also has different genders to add to the permutations but the strongest tends to be unbalanced with the male the higher level player where that pair wins around 60% of the time.

What I did was take a look at Adult 55 & Over matches from 2022 and break them out by both gender and level.

For the women:

  • 6.0 - 3.5/2.5 pairs beat 3.0/3.0 pairs 51% of the time
  • 7.0 - 4.0/3.0 pairs beat 3.5/3.5 pairs 55% of the time
  • 8.0 - 4.5/3.5 pairs beat 4.0/4.0 pairs 57% of the time
  • 9.0 - 5.0/4.0 pairs beat 4.5/4.5 pairs 66% of the time

We see unbalanced pairs have the advantage, very slight at 6.0 but growing as the level goes up.

For the men:

  • 6.0 - 3.5/2.5 pairs beat 3.0/3.0 pairs 57% of the time
  • 7.0 - 4.0/3.0 pairs beat 3.5/3.5 pairs 56% of the time
  • 8.0 - 4.5/3.5 pairs beat 4.0/4.0 pairs 56% of the time
  • 9.0 - 5.0/4.0 pairs beat 4.5/4.5 pairs 57% of the time

The men are very consistent at 56-57% of the time the unbalanced pair wins.

This is pretty consistent with the Mixed results: unbalanced pairs win more often.  The theory is that the unbalanced pairs generally have the strongest player and if they employ a strategy to minimize the lower rated partner's weaknesses, the stronger player rises to the top.  At 55-57% for most scenarios it is obviously not a given that unbalanced pairs will win, but the trend is pretty clear.

What do you think?  Is this consistent with what you see in your leagues?


Update: I did a subsequent deeper dive looking at actual vs expected results here.