@Hongyang66666

Proposed changes

Issue Number: close #58523

Problem

When upgrading Doris from older versions to newer versions, legacy data might contain empty jsonb values (where size is 0).

The current strict validation in checkAndCreateDocument throws an InvalidArgument error ("Invalid JSONB document: too small size(0)") when encountering such data. This causes critical failures during the compaction process.
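
The failure mode can be sketched in a few lines. This is a simplified illustration, not the code in be/src/util/jsonb_document.h: the `Status` struct, `kMinDocSize`, and `strict_check` are hypothetical stand-ins, and the two-byte minimum is an assumption made only for the sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for doris::Status; only what this sketch needs.
struct Status {
    bool ok;
    std::string msg;
};

// Assumed minimum for this sketch: one version byte plus one type byte.
constexpr std::size_t kMinDocSize = 2;

// Mimics a strict size check: any buffer below the minimum is rejected,
// which includes the empty (size == 0) values found in legacy data.
inline Status strict_check(const char* /*data*/, std::size_t size) {
    if (size < kMinDocSize) {
        return {false, "Invalid JSONB document: too small size(" +
                               std::to_string(size) + ")"};
    }
    return {true, ""};
}
```

With a check of this shape, a single empty legacy value is enough to make every caller that propagates the status fail, which matches the compaction failures reported in the issue.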

Fix

Modify be/src/util/jsonb_document.h to add a compatibility check.
If size == 0, the function now returns Status::OK() (treating it as a valid empty/null object) instead of throwing an error. This ensures compaction and queries can proceed normally on legacy data.

Check List

  • The code has been compiled and tested locally.
  • This PR fixes a bug.
  • This PR refactors code.
  • This PR adds a new feature.

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

inline Status JsonbDocument::checkAndCreateDocument(const char* pb, size_t size,
JsonbDocument** doc) {
*doc = nullptr;
// Fix Issue #58523: Tolerate empty data from legacy versions, treat as NULL to avoid errors.
Contributor

If we return OK here, will the rest of the logic still work?
For example,
JsonbDocument* doc = nullptr;
THROW_IF_ERROR(JsonbDocument::checkAndCreateDocument(
result.writer->getOutput()->getBuffer(), result.writer->getOutput()->getSize(),
&doc));
result.value = doc->getValue();
I think it will core dump here, because doc is still nullptr when getValue() is called.

Author

@yiguolei Thanks for the review! You are absolutely right.
Returning OK with a nullptr doc will indeed cause a crash in the caller.
I have updated the patch. Now, if size == 0, I initialize a static valid Null JsonbDocument (using JsonbWriter to construct it once) and assign it to *doc. This ensures the caller gets a safe, valid object representing a JSON null.
Please verify the latest commit.

yiguolei
yiguolei previously approved these changes Dec 3, 2025
@yiguolei
Contributor

yiguolei commented Dec 3, 2025

run buildall

@yiguolei
Contributor

yiguolei commented Dec 3, 2025

LGTM

@yiguolei
Contributor

yiguolei commented Dec 4, 2025

@Hongyang66666 compilation failed, please check the code.

Author

@Hongyang66666 Hongyang66666 left a comment

Hi @yiguolei @mrhhsg, thanks for the review.

I have addressed the potential crash issue.
Instead of returning nullptr, I now check if size == 0 and return a static valid Null JsonbDocument (constructed with {1, 0}).
This ensures *doc is always valid and safe to use by the caller.

Please review again, thanks!
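
A minimal sketch of the approach described above, under stated assumptions: `DocView`, `check_and_create`, and the constants are hypothetical simplifications rather than the real JsonbDocument API, and the header layout (version byte, then type byte) is taken from the {1, 0} description in this comment.

```cpp
#include <cstddef>

constexpr unsigned char kVersion = 1;   // assumed to mirror JSONB_VER
constexpr unsigned char kTypeNull = 0;  // assumed to mirror T_Null

// Static two-byte buffer representing a JSON null document.
static const unsigned char kNullDoc[] = {kVersion, kTypeNull};

// Hypothetical read-only view over a document buffer.
struct DocView {
    const unsigned char* bytes = nullptr;
    unsigned version() const { return bytes[0]; }
    unsigned type() const { return bytes[1]; }
};

// Returns true on success. On success, out->bytes is never nullptr,
// so a caller that dereferences the result right away stays safe.
inline bool check_and_create(const char* data, std::size_t size, DocView* out) {
    out->bytes = nullptr;
    if (size == 0) {
        out->bytes = kNullDoc;  // legacy empty value: treat as JSON null
        return true;
    }
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    if (size < sizeof(kNullDoc) || p[0] != kVersion) {
        return false;  // too small or wrong version: still an error
    }
    out->bytes = p;
    return true;
}
```

The point of the design is that success and a usable document always go together: the empty-input path hands back shared static storage instead of a null pointer, while malformed non-empty input is still rejected.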

// Manually construct the binary data of a valid Null document
// Byte 1: Version = 1 (JSONB_VER)
// Byte 2: Type = 0 (T_Null)
static char s_null_buf[] = {1, 0};
Contributor

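  1. Add a case to the BE unit tests that verifies a JsonbDocument constructed this way has version = 1 and type = T_Null. This guards against someone refactoring the code in the future and breaking the JSONB byte layout.
  2. The BE unit tests should also verify that this null JSONB document can be converted to a byte array, and that checkAndCreateDocument can then recreate the null JSONB document from those bytes.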
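
The two requested checks could look roughly like the following. This is a self-contained sketch, not a Doris unit test: `DocView`, `check_and_create`, and both check functions are hypothetical stand-ins, and a real case would exercise JsonbDocument::checkAndCreateDocument inside the BE unit test suite.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr unsigned char kVersion = 1;   // assumed to mirror JSONB_VER
constexpr unsigned char kTypeNull = 0;  // assumed to mirror T_Null
static const unsigned char kNullDoc[] = {kVersion, kTypeNull};

// Hypothetical view over a document buffer (stand-in for JsonbDocument).
struct DocView {
    const unsigned char* bytes = nullptr;
    std::size_t size = 0;
};

// Simplified stand-in for checkAndCreateDocument.
inline bool check_and_create(const char* data, std::size_t size, DocView* out) {
    *out = DocView{};
    if (size == 0) {  // legacy empty value maps to the static null document
        *out = DocView{kNullDoc, sizeof(kNullDoc)};
        return true;
    }
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    if (size < sizeof(kNullDoc) || p[0] != kVersion) return false;
    *out = DocView{p, size};
    return true;
}

// Check 1: the document built for empty input has the expected header bytes.
inline bool null_doc_has_expected_header() {
    DocView doc;
    if (!check_and_create(nullptr, 0, &doc)) return false;
    return doc.size == 2 && doc.bytes[0] == kVersion && doc.bytes[1] == kTypeNull;
}

// Check 2: serialize the null document to bytes, then parse it back.
inline bool null_doc_round_trips() {
    DocView first;
    if (!check_and_create(nullptr, 0, &first)) return false;
    std::vector<char> copy(first.bytes, first.bytes + first.size);
    DocView second;
    if (!check_and_create(copy.data(), copy.size(), &second)) return false;
    return second.size == first.size &&
           std::equal(second.bytes, second.bytes + second.size, first.bytes);
}
```

Pinning the header bytes in a test means that a later change to the byte layout fails loudly instead of silently producing a malformed fallback document.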

// Fix Issue #58523: Tolerate empty data from legacy versions.
// If size is 0, we return a static valid "Null" document.
if (size == 0) {
// Manually construct the binary data of a valid Null document
Contributor

All comments must be written in English.

eldenmoon pushed a commit that referenced this pull request Dec 15, 2025
This PR fixes the issue that was supposed to be resolved by #58656. We need to address it as soon as possible, so I am submitting this PR.
github-actions bot pushed a commit that referenced this pull request Dec 15, 2025
This PR fixes the issue that was supposed to be resolved by #58656. We need to address it as soon as possible, so I am submitting this PR.
csun5285 added a commit to csun5285/doris that referenced this pull request Dec 16, 2025
…he#59007)

This PR fixes the issue that was supposed to be resolved by apache#58656. We need to address it as soon as possible, so I am submitting this PR.
@yiguolei yiguolei closed this Dec 17, 2025
Labels

dev/3.1.x dev/4.0.x p0_b usercase Important user case type label

Development

Successfully merging this pull request may close these issues.

[Bug] Invalid JSONB document: too small size(0) or null pointer

4 participants