fix(ansi): handle CRLF line endings in Text.from_ansi #4099

Closed
Alm0stSurely wants to merge 1 commit into Textualize:master from Alm0stSurely:fix/crlf-empty-lines

@Alm0stSurely

Problem

Fixes #4090.

When Text.from_ansi is given input with CRLF (\r\n) line endings, all text content is lost and only empty lines remain.

Reproduction:

from rich.text import Text
Text.from_ansi("Hello\r\nWorld\r\n").plain
# Actual: '\n\n'
# Expected: 'Hello\nWorld\n'

Analysis

The root cause is an interaction between line splitting and carriage-return handling:

  1. AnsiDecoder.decode() splits on \n using re.split(r"(?<=\n)", ...), keeping the \n attached to the preceding line.
  2. Once the trailing \n is stripped, a line that originally ended with \r\n still ends in \r.
  3. decode_line() then calls line.rsplit("\r", 1)[-1] to simulate terminal carriage-return behavior (where \r moves the cursor to the start of the line, effectively overwriting prior content).
  4. For a line ending in \r (from CRLF), rsplit("\r", 1)[-1] returns the empty string after the final \r, discarding all content.
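
The steps above can be traced with the standard library alone. The sketch below is a simplified stand-in, not Rich's actual code: `simulate_decode_line` is a hypothetical helper that mirrors only the newline strip and the `rsplit` described in steps 2 to 4, and it ignores ANSI escape handling entirely.

```python
def simulate_decode_line(line: str) -> str:
    """Hypothetical stand-in for the per-line steps described above."""
    # Step 2: drop the trailing "\n" that the split left attached.
    if line.endswith("\n"):
        line = line[:-1]
    # Steps 3-4: keep only what follows the last "\r", as a terminal would
    # after the cursor returns to the start of the line.
    return line.rsplit("\r", 1)[-1]


# A progress-style line: later text overwrites earlier text.
print(repr(simulate_decode_line("10%\r100%\n")))
# A CRLF-terminated line: nothing follows the final "\r", so the result is empty.
print(repr(simulate_decode_line("Hello\r\n")))
# An LF-terminated line is unaffected.
print(repr(simulate_decode_line("Hello\n")))
```

The second call shows the failure mode: the `\r` that belongs to the line ending is treated as a cursor return, and the text before it is dropped.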

Solution

Normalize \r\n to \n in decode() before splitting. This treats CRLF as a plain line-ending sequence rather than a terminal carriage-return that overwrites content.

The fix is a single line:

terminal_text = terminal_text.replace("\r\n", "\n")

This preserves the existing behavior of standalone \r (used in progress bars and terminal updates) while correctly handling CRLF line endings common on Windows and in network protocols.

Verification

  • All 957 existing tests pass (25 skipped).
  • Added test_decode_crlf covering:
    • Simple CRLF: "Hello\r\nWorld\r\n"
    • Leading CRLF: "\r\nHello\r\n"
    • Consecutive CRLF: "Hello\r\n\r\nWorld"
    • Mixed line endings: "Hello\nWorld\r\n" and "Hello\r\nWorld\n"
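
As a rough cross-check of these cases without installing Rich, the pipeline from the Analysis section can be simulated in plain Python. `simulate_plain` and its `normalize` flag are hypothetical names introduced here. The function models only the split, the newline strip, and the `rsplit`, and it assumes the decoded lines are rejoined with `\n`. It is not the `test_decode_crlf` test from this PR.

```python
import re


def simulate_plain(text: str, normalize: bool = True) -> str:
    """Hypothetical model of the decode pipeline; not Rich's implementation."""
    if normalize:
        # The proposed fix: treat CRLF as a plain line ending.
        text = text.replace("\r\n", "\n")
    lines = []
    # Split after each "\n", keeping it attached to the preceding chunk.
    for chunk in re.split(r"(?<=\n)", text):
        if chunk.endswith("\n"):
            chunk = chunk[:-1]
        # Simulate carriage-return overwrite within a single line.
        lines.append(chunk.rsplit("\r", 1)[-1])
    # Assumption: decoded lines are rejoined with "\n".
    return "\n".join(lines)


cases = [
    "Hello\r\nWorld\r\n",   # simple CRLF
    "\r\nHello\r\n",        # leading CRLF
    "Hello\r\n\r\nWorld",   # consecutive CRLF
    "Hello\nWorld\r\n",     # mixed line endings
    "Hello\r\nWorld\n",     # mixed line endings
]
for case in cases:
    print(repr(case), "->", repr(simulate_plain(case)))
```

In this model, calling `simulate_plain("Hello\r\nWorld\r\n", normalize=False)` returns `'\n\n'`, which matches the behaviour reported in the Problem section.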

Notes

  • This is a minimal, localized change with no API surface changes.
  • Standalone \r (without following \n) continues to work as before, since the replace only targets the \r\n sequence.

Fixes Textualize#4090

The AnsiDecoder.decode() method splits input on \n and then calls
decode_line(), which strips \n and uses rsplit('\r', 1)[-1] to handle
carriage returns. When the input contains CRLF (\r\n), the split
leaves the \r in the line content, and rsplit('\r', 1)[-1] then
returns the empty string after the final \r, discarding all text.

This fix normalizes \r\n to \n before splitting, ensuring that
CRLF sequences are treated as plain line endings rather than
terminal carriage returns that overwrite content.

Added test_decode_crlf to cover CRLF, mixed line endings, and
consecutive CRLF sequences.

@willmcgugan
Member

Your PR has been closed due to an AI policy violation.

Please read the following before submitting further PRs.

https://github.com/Textualize/textual/blob/main/AI_POLICY.md

Development

Successfully merging this pull request may close these issues.

[BUG] Text.from_ansi leaves empty lines when input string has CRLF line endings