Omega Score Methodology

⚠️ Warning

The following content is technical and full of mathematical jargon.
You are not expected to follow every step - unless you’re into that sort of thing
(no judgement 🙃).

Proceed at your own risk.

But if you’re brave enough to dive in, I’d love your feedback and any ideas! 😊

Introduction

One of the key limitations of the WSDC point system is that it entirely overlooks contestants’ performances in non-final rounds. As a result, it fails to provide a meaningful measure of progress over time - particularly for those who do not consistently reach the final round, which, statistically, represents the majority of participants.

To address this issue, we propose an alternative scoring measure designed to offer a more comprehensive and fair evaluation of performance. Such a measure should account for several important factors: the type of competition, its tier (used here as a proxy for difficulty), the division, and the primary role of the participant. Importantly, preliminary round performance should be weighted more heavily in larger, more competitive events, while final placements should carry greater significance in smaller competitions where outperforming a limited pool of contestants is comparatively easier.

This principle aligns with the underlying logic of the WSDC system, which stratifies events into tiers and allocates points based on the size of the competition. Our proposed extension builds on this foundation by incorporating preliminary and semifinal performances, ultimately offering a more nuanced and comprehensive approach to measuring competitive success.

Overarching Idea

| Notation/Term | Meaning | Example |
|---|---|---|
| Type | Competition type. | Strictly or Jack & Jill (J&J). |
| Tier | Competition WSDC tier; depends on the number of contestants. | 25 Novice leaders in the prelim fall under Tier 3. |
| Division | Competition division. | Newcomer, Novice, Intermediate, etc. |
| Role | Competing role. | Leader or follower. |
| \(r\) | Competition round (type- and division-specific), \(r\in\{prelim, semi, final\}\). | J&J Novice prelim. |
| \(J^r\) | The number of judges, excluding the chief judge (type-, division-, round- and role-specific). | Four leader-judging judges in the J&J Novice prelim. |
| \(N^r\) | The number of unique contestants in round \(r\) (type-, division- and role-specific). | 12 followers in the J&J Novice final, \(N^{final}=12\). |
| \(x_i^r\) | The total number of marks, translated to points, of contestant \(i\) in round \(r\). | John was marked “Yes” three times and “No” once in the Intermediate J&J prelim; his total score in this round is 30. |

Set the total competition score (regardless of the event, type, tier, division or role) to be between 0 and 100, such that

  • 0 is achieved if contestant \(i\) scored the minimum number of points in the earliest round of the competition (e.g. was marked “No” by all judges in the prelim).
  • 100 is achieved if contestant \(i\) won 1st place and achieved the maximum number of points in all previous rounds (e.g. was marked “Yes” by all judges in the prelim and semi).

Let \(\omega_{max}^r\) be the maximum contribution component in round \(r\). Note that sometimes there are not enough contestants to form a semi-final or even a prelim (e.g. All-Stars at a small event), so depending on the competition size we have that

\[\begin{align*} \begin{cases} \textbf{Case 1: } &\omega_{max}^{prelim}+\omega_{max}^{semi}+\omega_{max}^{final}=100\\ \textbf{Case 2: } &\omega_{max}^{prelim}+\omega_{max}^{final}=100\\ \textbf{Case 3: } &\omega_{max}^{final}=100 \end{cases} \end{align*}\]

\(\omega_{max}^r\) is the maximum number of points achievable in a given round (e.g. 4 out of 4 judges marked you “Yes” in the prelim). It is type-, tier-, division- and role-specific to account for competition type, size, level and role-imbalanced competitor entries.

We want to map a competitor’s total score in round \(r\), \(x_i^r\), to \(\omega_i^r\), which lies between 0 and \(\omega_{max}^r\). That allows us to write

\[\begin{align*} \begin{cases} \textbf{Case 1: } &0\leq\omega_{i}^{prelim}+\mathbb{1}(i\in semi)\cdot\omega_{i}^{semi}+\mathbb{1}(i\in final)\cdot\omega_{i}^{final}\leq 100\\ \textbf{Case 2: } &0\leq\omega_{i}^{prelim}+\mathbb{1}(i\in final)\cdot\omega_{i}^{final}\leq 100\\ \textbf{Case 3: } &0\leq\omega_{i}^{final}\leq 100 \end{cases} \end{align*}\]

where

\[\begin{align*} \mathbb{1}(i\in r)=\begin{cases} 1&\text{if contestant } i\text{ qualified for round }r\\ 0&\text{otherwise.} \end{cases} \end{align*}\]

We can simplify the above by defining \(\Omega_{i}^r=\mathbb{1}(i\in r)\cdot\omega_{i}^r\), so that

\[\begin{align*} \begin{cases} \textbf{Case 1: } &0\leq\Omega_{i}^{prelim}+\Omega_{i}^{semi}+\Omega_{i}^{final}\leq 100\\ \textbf{Case 2: } &0\leq\Omega_{i}^{prelim}+\Omega_{i}^{final}\leq 100\\ \textbf{Case 3: } &0\leq\Omega_{i}^{final}\leq 100 \end{cases} \end{align*}\]

and the final score of contestant \(i\) is then defined as

\[\begin{align*} \Omega_i = \sum_{r}\Omega_{i}^r. \end{align*}\]

There are two fundamental technical challenges in the formulation above:

  1. Estimating \(\omega_{max}^r\).
  2. Mapping \(x_i^r\) onto \(\omega_i^r\).

We tackle each in turn below.

Round Participation

Fix \(i\) and the competition characteristics (type, division and role). We suggest defining \(\omega_{max}^r\) as

\[\begin{align*} \omega_{max}^r=100\cdot\frac{N^r}{\sum_r N^r}. \end{align*}\]

This yields an intuitive interpretation: a round-relative measure of competitive participation. It is specific to each competition and appropriately reflects its size, as intended. Importantly, it varies by division and role, since the number of contestants differs across these categories - thereby naturally capturing variability along these dimensions. In the case of Strictly divisions, where roles are always balanced, the metric still accounts for fluctuations in the total number of couples. Additionally, we observe that the sum across rounds satisfies \(\sum_r \omega_{max}^r=100\), as previously required.
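As a quick illustration, here is a minimal Python sketch of these weights (the function name and input layout are ours, purely illustrative):

```python
def round_weights(n_by_round: dict[str, int]) -> dict[str, float]:
    """Round-participation weights: omega_max^r = 100 * N^r / sum_r N^r."""
    total = sum(n_by_round.values())
    return {r: 100 * n / total for r, n in n_by_round.items()}

# e.g. 40 leaders in the prelim, 20 in the semi and 10 in the final:
print(round_weights({"prelim": 40, "semi": 20, "final": 10}))
# {'prelim': 57.14..., 'semi': 28.57..., 'final': 14.28...} -- sums to 100
```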

The primary limitation of this metric is that, in competitions featuring all rounds, a contestant who qualifies for the final contributes to the total multiple times - once for each round - potentially inflating the aggregate measure. However, alternative approaches, such as those based on survival rates, do not permit a meaningful estimation of a contestant’s contribution in the preliminary round. Given this trade-off, we acknowledge the limitation but consider it outweighed by the overall advantages of the proposed metric.

Result Mapping

Note that in any non-final round, the minimum and maximum number of points a competitor can receive are 0 and \(10J^r\), respectively. These correspond to the scenarios where all role-specific judges mark the competitor as “No” or “Yes”, respectively. There are two possible approaches to scoring in round \(r\):

  • Relative to other competitors: Evaluate each competitor’s performance against the highest and lowest scores achieved by others in the same round.
  • Relative to the round scale: Evaluate each competitor’s score against the minimum and maximum theoretically achievable scores in that round.

While both approaches are valid, only the second one satisfies the requirement established earlier: that the full 100 points should be awarded only if a competitor wins the competition and receives “Yes” from all judges in all rounds - i.e., they achieve the round-specific maximum each time.

For example, a competitor might receive all “Yes” marks except one “Alt1” in the prelim round. Under the first approach, they could still be considered the best performer, as their score exceeds that of all others. However, under the second approach, they fall 5 points short of the round’s maximum, since one judge marked them “Alt1” instead of “Yes”.
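Concretely, with \(J^r=4\) judges and the mark values implied above (“Yes” = 10, “Alt1” = 5, “No” = 0), such a competitor scores \(x_i^r=3\cdot 10+5=35\) against the round maximum of \(10J^r=40\).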

As we will see, the second approach is actually a special case of the first, so we develop the first one in detail. Let us define the actual minimum and maximum scores observed in a given round as

\[\begin{align*} m^r&=\mathrm{min}\left\{x_1^r, x_2^r, \dots, x_{N^r}^r\right\}\\ M^r&=\mathrm{max}\left\{x_1^r, x_2^r, \dots, x_{N^r}^r\right\}. \end{align*}\]

Then the solution to the second problem can be written as a linear mapping such that

\[\begin{align*} x_i^r\in[m^r, M^r]\to \omega_i^r \in[0, \omega_{max}^r]. \end{align*}\]

We want this mapping to preserve the respective point differences between contestants. This can be achieved by a min-max normalisation:

\[\begin{align*} \omega_i^r=\frac{(x_i^r-m^r)\cdot\omega_{max}^r}{M^r-m^r}\, \left[=\left(\frac{\omega_{max}^r}{M^r-m^r}\right)\cdot x_i^r-\frac{m^r\omega_{max}^r}{M^r-m^r}=ax_i^r+b\right]. \end{align*}\]

We obtain the second approach by substituting \(m^r=0\) and \(M^r=10J^r\). The expression above simplifies to

\[\begin{align*} \omega_{i}^r =\omega_{max}^r\cdot\frac{x_i^r}{10J^r}. \end{align*}\]
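Both approaches reduce to one linear rescaling; a small sketch (again with illustrative names), recovering the second approach by fixing \(m^r=0\) and \(M^r=10J^r\):

```python
def omega_round(x_i: float, omega_max: float, m: float, M: float) -> float:
    """Min-max mapping of a raw round score x_i in [m, M] onto [0, omega_max]."""
    return (x_i - m) * omega_max / (M - m)

# First approach: m and M are the lowest/highest scores observed in the round.
# Second approach: m = 0 and M = 10 * J^r. With J^r = 4 judges, three "Yes"
# and one "No" give x_i = 30; with omega_max^prelim = 57.1 from the earlier example:
print(omega_round(30, omega_max=57.1, m=0, M=10 * 4))  # ≈ 42.8
```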

Mapping in the Final

In this context, relative placement, competition size, and the number of judges are already incorporated into the WSDC point system - so it is both logical and efficient to leverage it directly. The overarching idea remains the same: rescale the WSDC final points to fall within the range \([0, \omega_{max}^{final}]\), which depends on the competition tier.

Let \(f_i^{n,p}\) denote the number of WSDC points awarded in the final round \((f)\) for tier \(n\in\{1,2,3,4,5\}\) and placement \(p\in\{1,2,\dots,P\}\), where \(P\) is the number of couples in the final. For example, \(f_i^{3,4}=4\), meaning a 4th place finish in a tier 3 event yields 4 WSDC points.

From this, we can readily determine \(M^{final}\), the maximum score, for each tier. However, identifying the minimum score \(m^{final}\) depends on the number of couples in the final. Once both bounds are established, we can proceed with rescaling as before.

\[\begin{align*} \omega_i^{final} = \frac{(f_i^{n,p}-m^{final})\cdot\omega_{max}^{final}}{M^{final}-m^{final}}. \end{align*}\]

Again, in the special case \(m^{final}=0\), the expression above simplifies to

\[\begin{align*} \omega_i^{final} = \omega_{max}^{final}\cdot\frac{f_i^{n,p}}{M^{final}}. \end{align*}\]
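A sketch of this rescaling follows; the points table is hypothetical apart from \(f^{3,4}=4\) quoted above, so the official WSDC tables should be substituted in practice:

```python
def omega_final(f_points: float, omega_max_final: float,
                points_table: dict[int, int]) -> float:
    """Rescale WSDC final points onto [0, omega_max_final]."""
    M = max(points_table.values())  # points for 1st place in this tier
    m = min(points_table.values())  # lowest points awarded in this final
    return (f_points - m) * omega_max_final / (M - m)

# Hypothetical tier-3 table keyed by placement (only 4th place -> 4 points
# comes from the text; replace with the official values):
tier3 = {1: 10, 2: 8, 3: 6, 4: 4, 5: 2}
print(omega_final(4, omega_max_final=14.3, points_table=tier3))  # ≈ 3.6
```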

Final Score

Combining the results above, we have

\[\begin{align*} \Omega_i &= \sum_{r}\Omega_i^r\\ &=\sum_{r}\mathbb{1}(i\in r)\omega_i^r\\ &=\sum_{r\neq final} \mathbb{1}(i\in r)\omega_{max}^{r}\cdot\frac{x_i^r-m^{r}}{M^{r}-m^{r}}+\mathbb{1}(i\in final)\omega_{max}^{final}\cdot\frac{f_i^{n,p}-m^{final}}{M^{final}-m^{final}}\\ &=\sum_{r\neq final} \mathbb{1}(i\in r)\left(\frac{100N^r}{\sum_r N^r} \right)\cdot\left(\frac{x_i^r-m^{r}}{M^{r}-m^{r}}\right)+\mathbb{1}(i\in final)\left(\frac{100N^{final}}{\sum_r N^r} \right)\cdot\left(\frac{f_i^{n,p}-m^{final}}{M^{final}-m^{final}}\right) \end{align*}\]

which in the special case simplifies to

\[\begin{align*} \Omega_i =\sum_{r\neq final} \mathbb{1}(i\in r)\left( \frac{100N^r}{\sum_r N^r} \right)\left(\frac{x_i^r}{10J^{r}}\right) + \mathbb{1}(i\in final)\left(\frac{100N^{final}}{\sum_r N^r} \right)\left(\frac{f_i^{n,p}}{M^{final}}\right) \end{align*}\]

Note that \(J^{r}\) is typically the same across non-final rounds. However, by keeping it explicit in the formula above, we allow for the possibility that the number of role-specific judges may vary between rounds.
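Putting the pieces together, here is the special-case formula end to end (helper names and the tier maximum \(M^{final}=10\) are assumed for illustration):

```python
def omega_total(n_by_round: dict[str, int], marks: dict[str, float],
                judges: dict[str, int], f_points: float | None,
                M_final: float) -> float:
    """Special-case Omega_i: non-final rounds enter as x_i^r / (10 J^r),
    the final as f_i / M_final; rounds i never danced contribute 0."""
    total_n = sum(n_by_round.values())
    w = {r: 100 * n / total_n for r, n in n_by_round.items()}
    score = sum(w[r] * x / (10 * judges[r]) for r, x in marks.items())
    if f_points is not None:  # plays the role of the indicator 1(i in final)
        score += w["final"] * f_points / M_final
    return score

# 4th place at a tier-3 event (M_final = 10 assumed), 4 judges per round,
# 30/40 marks in the prelim and 35/40 in the semi:
print(omega_total({"prelim": 40, "semi": 20, "final": 10},
                  {"prelim": 30, "semi": 35}, {"prelim": 4, "semi": 4},
                  f_points=4, M_final=10))  # ≈ 73.6
```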

Future Consideration

Instead of requiring competitors to calculate competitive participation themselves, we can estimate the weights by tier and role using publicly available data - specifically, empirical averages. In this approach, \(\hat\omega_{max}^r\) represents the average competitive participation across all available competitions, accounting for competition type, division, and role, so that

\[\begin{align*} \omega_{max}^r\approx \hat\omega_{max}^r. \end{align*}\]

We would then require an additional linear mapping such that

\[\begin{align*} \omega_i^r\in[0,\omega_{max}^r]\to\hat\omega_i^r\in[0, \hat\omega_{max}^r]. \end{align*}\]
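Such averages could be computed from scraped results; a sketch under an invented data layout (nothing here is prescribed by the WSDC):

```python
from collections import defaultdict

def estimate_weights(history: dict[tuple, list[dict[str, int]]]) -> dict:
    """Average omega_max^r over past competitions, grouped by
    (type, tier, division, role); each competition is a {round: N^r} dict."""
    estimates = {}
    for key, comps in history.items():
        sums, counts = defaultdict(float), defaultdict(int)
        for n_by_round in comps:
            total = sum(n_by_round.values())
            for r, n in n_by_round.items():
                sums[r] += 100 * n / total
                counts[r] += 1
        # When the round structure varies across competitions, these
        # per-round averages need not sum exactly to 100.
        estimates[key] = {r: sums[r] / counts[r] for r in sums}
    return estimates
```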

Let’s consider the advantages and disadvantages of this method:

  • Improved generalisability: While less flexible, this approach may generalise better. The current methodology is competition-specific, whereas using global averages could facilitate more meaningful comparisons across competitions and over time.
  • Simpler to implement: It is easier to compute overall, though selecting the appropriate values introduces the risk of user error.
  • Alignment with existing systems: This method aligns well with the current WSDC point tier structure.
  • Potential reliability issues: It may be unreliable in cases where data is sparse - for instance, Masters-level competitions are comparatively rare and may not provide robust averages.

Relative Percentile Score

An alternative method for scoring competition performance is to calculate each competitor’s relative ranking among all participants. Under this approach, all marks across rounds (translated into points), along with WSDC points awarded in the final (or relative placement in the case of a tie), are summed to determine an overall score. Competitors are then ranked based on these totals. However, this method has several notable drawbacks:

  • Handling ties: Additional logic is required to properly account for draws, which adds complexity.
  • Data demands: It is computationally intensive and time-consuming, as it requires complete data on all competitors (competition type, division, and role-specific) in order to establish rankings.

In particular, the need for full competition data makes this method impractical without dedicated technical infrastructure. It is, however, simple to understand in principle.
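For completeness, a sketch of this percentile ranking (the mid-rank tie rule below is one arbitrary choice among several):

```python
def percentile_scores(totals: dict[str, float]) -> dict[str, float]:
    """Percentile of each competitor's summed points within the full field."""
    n = len(totals)
    values = list(totals.values())
    pct = {}
    for who, v in totals.items():
        below = sum(u < v for u in values)   # competitors strictly beaten
        equal = sum(u == v for u in values)  # ties, including self
        # tied competitors share the midpoint of their rank range
        pct[who] = 100.0 if n == 1 else 100 * (below + (equal - 1) / 2) / (n - 1)
    return pct

print(percentile_scores({"Ann": 42, "Bob": 42, "Cy": 30, "Dee": 55}))
# {'Ann': 50.0, 'Bob': 50.0, 'Cy': 0.0, 'Dee': 100.0}
```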