@github-actions
Contributor

@github-actions github-actions bot commented Jan 12, 2026

Cherry-picked from #59394

Note: This PR depends on #59766 (cherry-pick of #58545) being merged first.

Summary

Introduce a Lucene boolean mode for the SEARCH function.

Test plan

  • Regression tests (to be run after the dependency PR is merged)

Related PRs: #59394
Depends on: #59766

@github-actions github-actions bot requested a review from yiguolei as a code owner January 12, 2026 02:25
@Thearas
Contributor

Thearas commented Jan 12, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Jan 12, 2026
@Thearas
Contributor

Thearas commented Jan 12, 2026

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.57% (1776/2232)
Line Coverage 64.89% (31462/48482)
Region Coverage 65.42% (15650/23923)
Branch Coverage 56.00% (8312/14842)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 55.80% (202/362) 🎉
Increment coverage report
Complete coverage report

@airborne12 airborne12 force-pushed the auto-pick-59394-branch-4.0 branch from 7b764b8 to 3e3d159 Compare January 12, 2026 08:02
@airborne12
Member

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.57% (1776/2232)
Line Coverage 64.88% (31455/48482)
Region Coverage 65.44% (15656/23923)
Branch Coverage 55.99% (8310/14842)


### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #58545

Problem Summary:

This PR introduces two new features for the SEARCH function:

#### 1. Lucene Boolean Mode

Adds a `mode` option to enable Lucene/Elasticsearch-style query parsing:

```sql
-- Enable Lucene mode via JSON options
SELECT * FROM docs WHERE search('apple AND banana',
  '{"default_field":"title","mode":"lucene"}');

-- With minimum_should_match
SELECT * FROM docs WHERE search('apple AND banana OR cherry',
  '{"default_field":"title","mode":"lucene","minimum_should_match":1}');
```

**Key differences from standard mode:**
- AND/OR/NOT work as left-to-right modifiers (not traditional boolean algebra)
- Uses MUST/SHOULD/MUST_NOT internally (like Lucene's Occur enum)
- Pure NOT queries return empty results (a positive clause is required)

**Behavior comparison:**

| Query | Standard Mode | Lucene Mode |
|-------|--------------|-------------|
| `a AND b` | a ∩ b | +a +b (both MUST) |
| `a OR b` | a ∪ b | a b (both SHOULD, min=1) |
| `NOT a` | ¬a | Empty (no positive clause) |
| `a AND NOT b` | a ∩ ¬b | +a -b (MUST a, MUST_NOT b) |
| `a AND b OR c` | (a ∩ b) ∪ c | +a b c (only a is MUST) |
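
The Lucene-mode column above can be reproduced with a small Python sketch. This is only one reading of the left-to-right rules, not the parser added in this PR: `Occur`, `assign_occurs`, and `render` are hypothetical names, and the sketch ignores parentheses, field prefixes, phrases, and `minimum_should_match`. It assumes that a conjunction rewrites the clause before it as well as the clause after it, which is what the `a AND b OR c` row suggests.

```python
from enum import Enum


class Occur(Enum):
    MUST = "+"
    SHOULD = ""
    MUST_NOT = "-"


def assign_occurs(query: str) -> list[tuple[Occur, str]]:
    """Assign an Occur to each term, scanning operators left to right."""
    clauses: list[list] = []  # mutable [occur, term] pairs
    pending = None            # conjunction seen since the previous term
    negate = False            # True when NOT precedes the next term
    for token in query.split():
        if token in ("AND", "OR"):
            pending = token
        elif token == "NOT":
            negate = True
        else:
            # Assumption: the conjunction also rewrites the previous clause,
            # unless that clause is already prohibited.
            if pending and clauses and clauses[-1][0] is not Occur.MUST_NOT:
                clauses[-1][0] = Occur.MUST if pending == "AND" else Occur.SHOULD
            if negate:
                occur = Occur.MUST_NOT
            elif pending == "AND":
                occur = Occur.MUST
            else:
                occur = Occur.SHOULD
            clauses.append([occur, token])
            pending, negate = None, False
    # A query without any positive clause matches nothing.
    if all(occur is Occur.MUST_NOT for occur, _ in clauses):
        return []
    return [(occur, term) for occur, term in clauses]


def render(clauses: list[tuple[Occur, str]]) -> str:
    """Format clauses in Lucene's +term / -term / term notation."""
    return " ".join(f"{occur.value}{term}" for occur, term in clauses)


print(render(assign_occurs("a AND b OR c")))
```

Under these assumptions, `a AND b OR c` renders as `+a b c`: the `OR` demotes `b` back to SHOULD, so only `a` remains required.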

#### 2. Escape Characters in DSL

Support for escaping special characters using backslash:

| Escape | Description | Example |
|--------|-------------|---------|
| `\ ` | Literal space | `title:First\ Value` matches "First Value" |
| `\(` `\)` | Literal parentheses | `title:hello\(world\)` matches "hello(world)" |
| `\:` | Literal colon | `title:key\:value` matches "key:value" |
| `\\` | Literal backslash | `title:path\\to\\file` matches "path\to\file" |
@yiguolei yiguolei force-pushed the auto-pick-59394-branch-4.0 branch from 3e3d159 to 0dfdf53 Compare January 13, 2026 01:30
@yiguolei
Contributor

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.57% (1776/2232)
Line Coverage 64.88% (31453/48482)
Region Coverage 65.43% (15654/23923)
Branch Coverage 55.98% (8309/14842)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 55.80% (202/362) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 44.44% (16/36) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.46% (18869/35293)
Line Coverage 39.28% (174993/445531)
Region Coverage 33.98% (135388/398409)
Branch Coverage 34.91% (58483/167525)

@yiguolei yiguolei closed this Jan 13, 2026
@yiguolei yiguolei reopened this Jan 13, 2026
@yiguolei
Contributor

run p0

@yiguolei
Contributor

run nonConcurrent

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 14, 2026
@github-actions
Contributor Author

PR approved by at least one committer and no changes requested.

@github-actions
Contributor Author

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit fef71d4 into branch-4.0 Jan 14, 2026
38 of 46 checks passed
@github-actions github-actions bot deleted the auto-pick-59394-branch-4.0 branch January 14, 2026 02:05
Labels: approved (Indicates a PR has been approved by one committer.), reviewed

7 participants