Fix(stats): Correct p-value calculation in permutation test by Geeks-Sid · Pull Request #8 · IUCompPath/PermRanker

Geeks-Sid · 2025-06-13T18:33:51Z

Description

This pull request addresses a bug in the perform_permutation_test method where the p-value was calculated using an incorrect one-sided test. This resulted in unreliable significance testing, potentially masking real differences between methods.

The Bug: Incorrect One-Sided Test Logic

The previous implementation calculated the observed difference in summed ranks as diff_ranks = arr_i.sum() - arr_j.sum() and then counted how many permuted differences were smaller (diff_ranks_rand < diff_ranks).

This logic only worked correctly if method_i was superior to method_j (resulting in a large negative diff_ranks). If method_j was superior to method_i, diff_ranks would be a large positive number. The condition diff_ranks_rand < diff_ranks would then be met by almost all permutations, leading to an erroneously high p-value (e.g., > 0.5) and a failure to detect a statistically significant difference.
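To make the asymmetry concrete, here is a minimal sketch of the old one-sided counting logic. It is not the code from pyranker/ranker.py: the function name one_sided_pvalue, the n_perm and seed parameters, and the permutation scheme (randomly swapping the two methods' ranks per case) are assumptions made for illustration only.

```python
import numpy as np

def one_sided_pvalue(arr_i, arr_j, n_perm=2000, seed=0):
    """Hypothetical re-creation of the old one-sided count."""
    rng = np.random.default_rng(seed)
    diff_ranks = arr_i.sum() - arr_j.sum()
    count = 0
    for _ in range(n_perm):
        # Assumption: a permutation swaps the two methods' ranks on a
        # random subset of cases. The real implementation may differ.
        swap = rng.random(arr_i.size) < 0.5
        perm_i = np.where(swap, arr_j, arr_i)
        perm_j = np.where(swap, arr_i, arr_j)
        diff_ranks_rand = perm_i.sum() - perm_j.sum()
        if diff_ranks_rand < diff_ranks:
            count += 1
    return count / n_perm

# Toy data: one method always gets rank 1 (better), the other rank 2.
better = np.ones(20)
worse = np.full(20, 2.0)

p_better_first = one_sided_pvalue(better, worse)  # small p-value
p_worse_first = one_sided_pvalue(worse, better)   # p-value close to 1
```

The same pair of methods yields a very small p-value in one argument order and a p-value near 1 in the other, which is the order dependence described above.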

The Fix: Implementing a Standard Two-Sided Test

The permutation test has been corrected to use a standard two-sided test, which is the correct approach for determining if two methods are significantly different, regardless of the direction of the difference.

The changes are as follows:

- Use Absolute Differences: The test now compares the absolute magnitude of the observed difference with the absolute magnitude of the permuted differences.
  - Before the loop: observed_diff = abs(arr_i.sum() - arr_j.sum())
  - Inside the loop: if abs(permuted_diff) >= observed_diff:
- Symmetric P-value Matrix: The p-value matrix (self.pvals) is now correctly populated to be symmetric (self.pvals[i, j] = self.pvals[j, i]), as the significance of the difference between method A and B is identical to that between B and A.
- Improved Initialization: The self.pvals matrix is now initialized with np.ones(), which is more semantically correct, as the default state (including the diagonal) represents no significant difference (p=1.0).
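The three changes above can be sketched together as follows. This is an illustrative standalone function, not the actual perform_permutation_test method: the name pairwise_two_sided_pvals, the (n_methods, n_cases) input layout, the n_perm and seed parameters, and the per-case swap used to generate permutations are all assumptions. Only observed_diff, permuted_diff, the abs() comparison, the np.ones() initialization, and the symmetric assignment come from the description.

```python
import numpy as np

def pairwise_two_sided_pvals(ranks, n_perm=2000, seed=0):
    """Sketch: ranks has shape (n_methods, n_cases); returns a symmetric matrix."""
    rng = np.random.default_rng(seed)
    n_methods, n_cases = ranks.shape
    # Default state, including the diagonal: no significant difference.
    pvals = np.ones((n_methods, n_methods))
    for i in range(n_methods):
        for j in range(i + 1, n_methods):
            arr_i, arr_j = ranks[i], ranks[j]
            observed_diff = abs(arr_i.sum() - arr_j.sum())
            count_extreme = 0
            for _ in range(n_perm):
                # Assumption: swap the two methods' ranks on random cases.
                swap = rng.random(n_cases) < 0.5
                perm_i = np.where(swap, arr_j, arr_i)
                perm_j = np.where(swap, arr_i, arr_j)
                permuted_diff = perm_i.sum() - perm_j.sum()
                # Two-sided: compare magnitudes, ignoring direction.
                if abs(permuted_diff) >= observed_diff:
                    count_extreme += 1
            # The result does not depend on argument order, so fill both cells.
            pvals[i, j] = pvals[j, i] = count_extreme / n_perm
    return pvals

# Toy usage: methods 0 and 2 rank identically, method 1 is always worse.
toy_ranks = np.vstack([np.ones(20), np.full(20, 2.0), np.ones(20)])
toy_pvals = pairwise_two_sided_pvals(toy_ranks)
```

In this sketch the p-value is the plain fraction count_extreme / n_perm, so it can be exactly zero when no permutation is as extreme as the observed difference. Whether the merged code applies a correction for that case is not stated in this description.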

Geeks-Sid · 2025-07-31T19:53:45Z

This code should be ready to merge and can be verified with the regular test suite.

Copilot

Pull Request Overview

This PR fixes a critical bug in the perform_permutation_test method where the p-value calculation used an incorrect one-sided test that could fail to detect statistically significant differences when the second method outperformed the first.

- Corrects the permutation test to use a proper two-sided test by comparing absolute differences
- Ensures the p-value matrix is symmetric, as it should be for pairwise comparisons
- Improves code efficiency with boolean indexing and better variable management

pyranker/ranker.py

Update ranker.py fixing bug for extremes

230c67b

Geeks-Sid marked this pull request as ready for review July 31, 2025 19:53

Geeks-Sid requested a review from sarthakpati as a code owner July 31, 2025 19:53

Geeks-Sid mentioned this pull request Jul 31, 2025

count_extreme is no longer zero #11

Merged

sarthakpati requested a review from Copilot July 31, 2025 20:22

Copilot AI reviewed Jul 31, 2025

sarthakpati merged commit dc00ff4 into IUCompPath:main Jul 31, 2025
1 check failed

sarthakpati merged 1 commit into IUCompPath:main from Geeks-Sid:main
