
feat: add line-by-line mode as default, stream without loading files into memory#328

Merged
oriongonza merged 8 commits into master from optimize
Feb 25, 2026

Conversation


@oriongonza oriongonza commented Feb 21, 2026

Implements line-by-line processing as the default mode, replacing the previous behavior of loading entire files into memory via mmap. Adds --across (-A) for the old whole-file behavior when needed.

Changes

  • Default mode is now line-by-line: files are processed as a stream, so the whole file is never held in memory
  • New --across / -A flag: opts in to the old whole-file mmap behavior, which is faster when memory isn't a concern
  • Chunked I/O: input is read in 8KB chunks into a line buffer, keeping syscall overhead low and memory use bounded
  • Refactored codebase: split into sd (library) and sd-cli (binary) crates
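The chunked streaming approach described above can be sketched roughly as follows. This is a minimal illustration, not sd's actual internals: `process_stream` and its signature are invented for the example, and the replacer is a plain closure standing in for the real regex replacement.

```rust
use std::io::{self, Read, Write};

/// Process `input` line by line in fixed-size chunks, applying `replace`
/// to each complete line and writing the result to `output`. Memory use
/// is bounded by the chunk size plus the longest single line.
fn process_stream<R: Read, W: Write>(
    mut input: R,
    output: &mut W,
    replace: impl Fn(&str) -> String,
) -> io::Result<()> {
    let mut chunk = [0u8; 8192]; // 8KB chunks, as described in the PR
    let mut line_buf: Vec<u8> = Vec::new(); // carries partial lines across chunks

    loop {
        let n = input.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        let mut rest = &chunk[..n];
        // Emit every complete line found in this chunk.
        while let Some(pos) = rest.iter().position(|&b| b == b'\n') {
            line_buf.extend_from_slice(&rest[..pos]);
            let line = String::from_utf8_lossy(&line_buf);
            output.write_all(replace(&line).as_bytes())?;
            output.write_all(b"\n")?;
            line_buf.clear();
            rest = &rest[pos + 1..];
        }
        line_buf.extend_from_slice(rest);
    }
    // A file without a trailing newline leaves a final partial line.
    if !line_buf.is_empty() {
        let line = String::from_utf8_lossy(&line_buf);
        output.write_all(replace(&line).as_bytes())?;
    }
    Ok(())
}

fn main() {
    let input = b"foo bar\nbaz foo" as &[u8]; // no trailing newline
    let mut out = Vec::new();
    process_stream(input, &mut out, |l| l.replace("foo", "qux")).unwrap();
    assert_eq!(out, b"qux bar\nbaz qux"); // final line preserved without '\n'
}
```

Because output is written as soon as each line is complete, this shape also gives the streaming-stdin behavior the PR advertises: no waiting for EOF.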

Closes

Closes #96 — massive memory usage on large files
Closes #100 — stdin now streams, works with journalctl -f | sd ... and similar
Closes #154 — memory allocation failure on files too large to fit in RAM
Closes #286 — sd now streams by default; the documented caveat no longer applies
Closes #302 — output is emitted as lines are processed, not buffered until EOF

Also closes #290. I'm back.

Benchmarks

1M lines (~36MB), foo → qux:

| Command | Time |
| --- | --- |
| `sd -A 'foo' 'qux'` (across, whole-file) | 33ms |
| `sd 'foo' 'qux'` (line-by-line, default) | 106ms |
| `sed s/foo/qux/g` | 120ms |

Line-by-line is faster than sed while using a fraction of the memory.

@oriongonza oriongonza changed the title perf: optimize line-by-line mode with chunked reading feat: add line-by-line mode as default, stream without loading files into memory Feb 21, 2026
oriongonza and others added 7 commits February 25, 2026 12:08
This made no sense because we don't intend to ever release `sd` as a crate

Add a new processing mode that handles input line by line instead of
reading entire files into memory. This fixes several long-standing issues:
- OOM on large files (O(line_size) memory instead of O(file_size))
- stdin waits for EOF (output now flushed per line, enables streaming)
- `^` matches phantom empty line after trailing `\n`
- `\s+$` eats newlines because `\s` sees `\n` across line boundaries

The implementation strips `\n` before passing each line to the replacer,
then restores it, so regex never sees newline characters. Files without
trailing newlines are preserved as-is. In-place file modification uses
the same atomic temp-file-and-rename pattern as the existing code path.
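The strip-and-restore step this commit describes can be sketched like so. A minimal sketch, assuming a `replace_line` helper that is invented for illustration; the replacer closure stands in for the real regex engine.

```rust
/// Apply `replace` to one raw line. The trailing newline (if any) is
/// stripped before the replacer runs, so patterns like `^` or `\s+$`
/// never see a `\n`, then restored afterwards.
fn replace_line(raw: &str, replace: impl Fn(&str) -> String) -> String {
    let (line, newline) = match raw.strip_suffix('\n') {
        Some(stripped) => (stripped, "\n"),
        None => (raw, ""), // final line of a file without a trailing newline
    };
    let mut out = replace(line);
    out.push_str(newline); // restore exactly what was there
    out
}

fn main() {
    // A replacer that trims trailing whitespace, like `sd '\s+$' ''`.
    let trim = |l: &str| l.trim_end().to_string();
    assert_eq!(replace_line("hello   \n", trim), "hello\n"); // newline survives
    assert_eq!(replace_line("no newline", trim), "no newline");
}
```

Restoring the newline only when it was present is what preserves files that lack a trailing newline.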

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Line-by-line processing is now the default behavior. This provides
better defaults for common use cases: lower memory usage, streaming
stdin output, and predictable regex anchor behavior.

For patterns that need to match across line boundaries (e.g. replacing
\n or multi-line patterns), use the new --across / -A flag which
restores the previous whole-file behavior.

Pre-validates all input files before modifying any, matching the
atomicity guarantees of the mmap-based code path.
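The temp-file-and-rename pattern mentioned here can be sketched as below. This is an illustrative reduction, not sd's code: the function name is hypothetical, and a real implementation would use a unique temp-file name rather than a fixed `.tmp` extension.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Rewrite `path` in place via a temp file in the same directory plus a
/// rename, so a crash mid-write never leaves a truncated original.
fn rewrite_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    let tmp = path.with_extension("tmp"); // illustrative; real code needs a unique name
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents)?;
        f.sync_all()?; // ensure data hits disk before the rename
    }
    fs::rename(&tmp, path)?; // atomic replacement on POSIX, same filesystem
    Ok(())
}

fn main() -> io::Result<()> {
    let target = std::env::temp_dir().join("sd_demo.txt");
    fs::write(&target, b"old")?;
    rewrite_atomically(&target, b"new contents")?;
    assert_eq!(fs::read(&target)?, b"new contents");
    fs::remove_file(&target)
}
```

Keeping the temp file in the same directory as the target matters: `rename` is only atomic within a single filesystem.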

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add benchmark results comparing line-by-line (default) and across (-A)
modes on a 1M line (~36MB) test file:
- Line-by-line is ~2-3x slower than across mode for throughput
- Still faster than sed for regex replacements
- Memory usage: 3 MB (line-by-line) vs 74 MB (across)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace per-line read_until() calls with chunked reading (8KB chunks)
and a line buffer that spans chunk boundaries. This reduces syscall
overhead and improves CPU cache locality.

Benchmark results on 1M line file (~36MB):
- Before: 357ms (2.84x slower than across mode, slower than sed)
- After:  106ms (3.19x slower than across mode, 1.1x faster than sed)

The trade-off between modes is:
- Across mode: fastest (33ms), uses more memory (~74MB)
- Line-by-line: now much faster (106ms), bounded memory usage
- Line-by-line still respects memory limits for streaming use cases

fix build, tests, and lint regressions

remove file-mapping code paths and dependency
@oriongonza oriongonza merged commit 4a7b216 into master Feb 25, 2026
4 of 9 checks passed

ofek commented Feb 25, 2026

Awesome work!


varenc commented Feb 26, 2026

Thank you for this!
