Commit 701ca28

flate: reduce stateless allocations (#1106)
After updating Go to v1.24+, we detected a sharp increase in CPU utilization. A heap profile revealed increased memory allocations in the Write and Close methods of gzip.Writer in stateless mode. This PR optimizes the problem area by using a sync.Pool and by allocating the tokens object later.

Benchmarks:

BEFORE
```
BenchmarkEncodeDigitsSL1e4-12    10141    115946 ns/op     86.25 MB/s    542379 B/op    3 allocs/op
BenchmarkEncodeDigitsSL1e5-12     1602    730674 ns/op    136.86 MB/s    541377 B/op    2 allocs/op
BenchmarkEncodeDigitsSL1e6-12      175   6851506 ns/op    145.95 MB/s    541542 B/op    2 allocs/op
BenchmarkEncodeTwainSL1e4-12      9708    131564 ns/op     76.01 MB/s    542146 B/op    3 allocs/op
BenchmarkEncodeTwainSL1e5-12      1663    684854 ns/op    146.02 MB/s    541463 B/op    2 allocs/op
BenchmarkEncodeTwainSL1e6-12       177   6435648 ns/op    155.38 MB/s    541654 B/op    2 allocs/op
```

AFTER
```
BenchmarkEncodeDigitsSL1e4-12    34747     33800 ns/op    295.86 MB/s         8 B/op    0 allocs/op
BenchmarkEncodeDigitsSL1e5-12     1771    640723 ns/op    156.07 MB/s       160 B/op    0 allocs/op
BenchmarkEncodeDigitsSL1e6-12      181   6759226 ns/op    147.95 MB/s      1573 B/op    0 allocs/op
BenchmarkEncodeTwainSL1e4-12     35294     35304 ns/op    283.26 MB/s         8 B/op    0 allocs/op
BenchmarkEncodeTwainSL1e5-12      1939    585755 ns/op    170.72 MB/s       146 B/op    0 allocs/op
BenchmarkEncodeTwainSL1e6-12       181   6505389 ns/op    153.72 MB/s      1573 B/op    0 allocs/op
```

## Summary by CodeRabbit

- **Refactor**
  - Optimized compression internals to reuse buffers via pooling, improving throughput and reducing memory use during repeated operations.
  - Enhances performance and consistency for both dictionary and non-dictionary compression paths across large blocks.
  - No changes to public APIs or user-facing behavior; workflows remain the same.
  - Users may see faster compression and a lower memory footprint under sustained/high-volume workloads.
1 parent e0b47ff commit 701ca28

1 file changed: +16 -4 lines changed
flate/stateless.go

Lines changed: 16 additions & 4 deletions
@@ -61,13 +61,19 @@ var bitWriterPool = sync.Pool{
 	},
 }
 
+// tokensPool contains tokens struct objects that can be reused
+var tokensPool = sync.Pool{
+	New: func() any {
+		return &tokens{}
+	},
+}
+
 // StatelessDeflate allows compressing directly to a Writer without retaining state.
 // When returning everything will be flushed.
 // Up to 8KB of an optional dictionary can be given which is presumed to precede the block.
 // Longer dictionaries will be truncated and will still produce valid output.
 // Sending nil dictionary is perfectly fine.
 func StatelessDeflate(out io.Writer, in []byte, eof bool, dict []byte) error {
-	var dst tokens
 	bw := bitWriterPool.Get().(*huffmanBitWriter)
 	bw.reset(out)
 	defer func() {
@@ -91,6 +97,12 @@ func StatelessDeflate(out io.Writer, in []byte, eof bool, dict []byte) error {
 	// For subsequent loops, keep shallow dict reference to avoid alloc+copy.
 	var inDict []byte
 
+	dst := tokensPool.Get().(*tokens)
+	dst.Reset()
+	defer func() {
+		tokensPool.Put(dst)
+	}()
+
 	for len(in) > 0 {
 		todo := in
 		if len(inDict) > 0 {
@@ -113,9 +125,9 @@ func StatelessDeflate(out io.Writer, in []byte, eof bool, dict []byte) error {
 		}
 		// Compress
 		if len(inDict) == 0 {
-			statelessEnc(&dst, todo, int16(len(dict)))
+			statelessEnc(dst, todo, int16(len(dict)))
 		} else {
-			statelessEnc(&dst, inDict[:maxStatelessDict+len(todo)], maxStatelessDict)
+			statelessEnc(dst, inDict[:maxStatelessDict+len(todo)], maxStatelessDict)
 		}
 		isEof := eof && len(in) == 0

@@ -129,7 +141,7 @@ func StatelessDeflate(out io.Writer, in []byte, eof bool, dict []byte) error {
 			// If we removed less than 1/16th, huffman compress the block.
 			bw.writeBlockHuff(isEof, uncompressed, len(in) == 0)
 		} else {
-			bw.writeBlockDynamic(&dst, isEof, uncompressed, len(in) == 0)
+			bw.writeBlockDynamic(dst, isEof, uncompressed, len(in) == 0)
 		}
 		if len(in) > 0 {
 			// Retain a dict if we have more
