fix: handle small max_templated_field_length safely#59882
fix: handle small max_templated_field_length safely#59882Jeevankumar-s wants to merge 4 commits intoapache:mainfrom
Conversation
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
|
|
LGTM. @amoghrajesh ? |
amoghrajesh
left a comment
There was a problem hiding this comment.
This change also needs to go in https://github.com/astronomer/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/task_runner.py#L830-L901 which is where things happen at runtime for a task
| def _truncate_rendered_value(rendered: str, max_length: int) -> str: | ||
| if max_length <= 0: | ||
| return "" | ||
|
|
||
| prefix = "Truncated. You can change this behaviour in [core]max_templated_field_length. " | ||
| suffix = "..." | ||
|
|
||
| if max_length <= len(prefix): | ||
| return rendered[:max_length] | ||
|
|
||
| available = max_length - len(prefix) - len(suffix) | ||
| if available <= 0: | ||
| return rendered[:max_length] | ||
|
|
||
| return f"{prefix}{rendered[:available]!r}{suffix}" |
There was a problem hiding this comment.
While this is good, from a user perspective, if max_length is set to 1, it would just display rendered[0]. I think from a user perspective, we should prioritise communication to user over "correctness". Or the user wont know what happened.
Always better to present problem + solution vs misleading partial result ig
There was a problem hiding this comment.
Thanks for pointing this out.
I agree that showing just a single character can be confusing from a user perspective.
I’ll update the logic so that, even for very small max_templated_field_length values, we prioritize clearly communicating that truncation has occurred rather than returning a misleading partial result.
I’ll also apply the same change in the runtime path (task_runner.py) so the behavior stays consistent.
I’ll push an update shortly.
amoghrajesh
left a comment
There was a problem hiding this comment.
I have some concerns with this implementation, check review comments for details.
| def _truncate_rendered_value(rendered: str, max_length: int) -> str: | ||
| if max_length <= 0: | ||
| return "" | ||
|
|
||
| prefix = "Truncated. You can change this behaviour in [core]max_templated_field_length. " | ||
| suffix = "..." | ||
|
|
||
| if max_length <= len(suffix): | ||
| return suffix[:max_length] | ||
|
|
||
| if max_length <= len(prefix) + len(suffix): | ||
| return (prefix + suffix)[:max_length] | ||
|
|
||
| available = max_length - len(prefix) - len(suffix) | ||
| return f"{prefix}{rendered[:available]}{suffix}" | ||
|
|
||
|
|
||
|
|
||
| def _safe_truncate_rendered_value(rendered: Any, max_length: int) -> str: | ||
| return _truncate_rendered_value(str(rendered), max_length) |
There was a problem hiding this comment.
This still doesn't prioritise message first. Take these test cases for example:
[(1, 'test', 'Minimum value'),
(3, 'test', 'At ellipsis length'),
(5, 'test', 'Very small'),
(10, 'password123', 'Small'),
(20, 'secret_value', 'Small with content'),
(50, 'This is a test string', 'Medium'),
(83, 'Hello World', 'At prefix+suffix boundary v1'),
(84, 'Hello World', 'Just above boundary v1'),
(86, 'Hello World', 'At overhead boundary v2'),
(90, 'short', 'Normal case - short string'),
(100, 'This is a longer string', 'Normal case'),
(150,
'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
'Large max_length, long string'),
(100, 'None', "String 'None'"),
(100, 'True', "String 'True'"),
(100, "{'key': 'value'}", 'Dict-like string'),
(100, "test's", 'String with apostrophe'),
(90, '"quoted"', 'String with quotes')]This is what your function returns:
for max_length, rendered, description in test_cases:
v1_result = truncate_v1(rendered, max_length)
print(v1_result)
.
...
Trunc
Truncated.
Truncated. You can c
Truncated. You can change this behaviour in [core]
Truncated. You can change this behaviour in [core]max_templated_field_length. He...
Truncated. You can change this behaviour in [core]max_templated_field_length. Hel...
Truncated. You can change this behaviour in [core]max_templated_field_length. Hello...
Truncated. You can change this behaviour in [core]max_templated_field_length. short...
Truncated. You can change this behaviour in [core]max_templated_field_length. This is a longer st...
Truncated. You can change this behaviour in [core]max_templated_field_length. xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
Truncated. You can change this behaviour in [core]max_templated_field_length. None...
Truncated. You can change this behaviour in [core]max_templated_field_length. True...
Truncated. You can change this behaviour in [core]max_templated_field_length. {'key': 'value'}...
Truncated. You can change this behaviour in [core]max_templated_field_length. test's...
Truncated. You can change this behaviour in [core]max_templated_field_length. "quoted"...This is what I would expect:
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. ...
Truncated. You can change this behaviour in [core]max_templated_field_length. 'He'...
Truncated. You can change this behaviour in [core]max_templated_field_length. 'short'...
Truncated. You can change this behaviour in [core]max_templated_field_length. 'This is a longer'...
Truncated. You can change this behaviour in [core]max_templated_field_length. 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'...
Truncated. You can change this behaviour in [core]max_templated_field_length. 'None'...
Truncated. You can change this behaviour in [core]max_templated_field_length. 'True'...
Truncated. You can change this behaviour in [core]max_templated_field_length. "{'key': 'value'}"...
Truncated. You can change this behaviour in [core]max_templated_field_length. "test's"...
Truncated. You can change this behaviour in [core]max_templated_field_length. '"quote'... There was a problem hiding this comment.
Thanks for the examples. I agree and have updated the logic to always prioritise the truncation message and avoid partial prefix or value output. The results now match the expected outputs you shared.
|
@Jeevankumar-s lot of test failures, could you adhere to them before I can look? |
Stopped working on this PR since #59999 is actively addressing the issue. |
Fixes incorrect truncation when [core] max_templated_field_length is set to very small values.
Previously, negative slicing could result in empty or malformed output.
This change ensures truncation always respects the configured maximum length.
Related: #59877