Ignore first phasing bit in process_gt_to_hap / process_gt_to_hap2 #2443

Merged
pd3 merged 1 commit into samtools:develop from daviesrob:hap-ignore-phase-prefix
Aug 12, 2025

@daviesrob
HTSlib version 1.22 introduced prefixed phasing support for VCF files with version 4.4 or later. Before that change, the first GT phase bit (when reading VCF) was always zero; since then, for such files, it is set either explicitly or when all other alleles are phased. This change updates process_gt_to_hap() and process_gt_to_hap2() to ignore the first phase bit, removing the assumption that it is always zero.
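The rule for the first phase bit described above can be sketched roughly as follows. This is an illustration only, not HTSlib code: `first_phase_bit()` is a hypothetical helper that works on the GT text directly, and it assumes that a genotype with no other separators (for example a haploid call) counts as "all other alleles phased".

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of HTSlib): decide the first phase bit
 * for a VCF 4.4-style GT string such as "0|1", "0/1", "|0/1" or "/0|1".
 */
bool first_phase_bit(const char *gt)
{
    /* An explicit prefix decides the first phase bit directly. */
    if (gt[0] == '|') return true;
    if (gt[0] == '/') return false;

    /* No prefix: set only if no unphased separator appears anywhere.
     * Assumption: a GT with no separators at all is treated as phased. */
    return strchr(gt, '/') == NULL;
}
```

Under this sketch, "0|1" yields a set first bit while "0/1" does not, which is why code that expected the first bit to be zero starts to misbehave on fully phased genotypes.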

Fixes incorrect reporting of phase when using HTSlib 1.22 to read VCF files with version 4.4 or 4.5.

To reproduce the issue:

./bcftools view -Oz -o /tmp/convert.vcf.gz test/convert.vcf
sed 's/fileformat=VCFv4\.1/fileformat=VCFv4.4/' test/convert.vcf | ./bcftools view -Oz -o /tmp/convert44.vcf.gz
./bcftools convert -h -,.,. /tmp/convert.vcf.gz 2> /dev/null | head -n 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
./bcftools convert -h -,.,. /tmp/convert44.vcf.gz 2> /dev/null | head -n 1
0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0* 0*

After applying this patch:

./bcftools convert -h -,.,. /tmp/convert44.vcf.gz 2> /dev/null | head -n 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
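The idea behind the fix can be sketched as below. This is a simplified illustration, not the actual bcftools code: `format_hap_pair()` and the `GT_*` macros are hypothetical names, the macros are local copies of the BCF genotype encoding `((allele + 1) << 1) | phased`, and the output convention (a `*` suffix on both alleles of an unphased genotype) is reduced to the minimum needed to mirror the output shown above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for the BCF GT encoding: ((allele + 1) << 1) | phased. */
#define GT_ENCODE(allele, phased) ((((allele) + 1) << 1) | ((phased) ? 1 : 0))
#define GT_ALLELE(v)              (((v) >> 1) - 1)
#define GT_IS_PHASED(v)           ((v) & 1)
#define GT_IS_MISSING(v)          (((v) >> 1) == 0)

/*
 * Hypothetical helper: write a diploid genotype as "a b" when phased or
 * "a* b*" when unphased. Returns -1 for missing alleles, which this
 * sketch does not format.
 */
int format_hap_pair(const int32_t gt[2], char *buf, size_t len)
{
    if (GT_IS_MISSING(gt[0]) || GT_IS_MISSING(gt[1]))
        return -1;

    /* Only the second allele's phase bit is consulted; the first bit is
     * deliberately ignored, so it may be 0 or 1 without changing output. */
    const char *mark = GT_IS_PHASED(gt[1]) ? "" : "*";

    return snprintf(buf, len, "%d%s %d%s",
                    (int)GT_ALLELE(gt[0]), mark,
                    (int)GT_ALLELE(gt[1]), mark);
}
```

With this approach, a phased `0|0` produces the same text whether the first value is encoded as 2 (first bit clear, as before HTSlib 1.22) or 3 (first bit set, as for VCF 4.4 input).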

Pull request samtools/htslib#1938 will change HTSlib to set the first phase bit for all VCF files irrespective of version, after which this problem will become rather more obvious. It would be useful to merge this change before the HTSlib one goes in.

pd3 merged commit d648dd4 into samtools:develop on Aug 12, 2025
8 checks passed
daviesrob deleted the hap-ignore-phase-prefix branch on August 14, 2025 09:13