@aalva500-prog
Contributor

@aalva500-prog aalva500-prog commented Sep 30, 2025

Description

Currently, the result of a SQL/PPL count query is capped at MAX_INT.

For example, in the following query the result of count(requestId) is capped at the maximum integer value (2147483647), although the actual number of requestId values could be higher:

{
  "query": "source = accounts | stats count(requestId)"
}
{
  "schema": [
    {
      "name": "count(requestId)",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      2147483647
    ]
  ],
  "total": 1,
  "size": 1
}

This PR addresses the above issue by changing the return data type of the count() function from integer to long, so the new result will be as follows:

{
  "query": "source = accounts | stats count(requestId)"
}
{
  "schema": [
    {
      "name": "count(requestId)",
      "type": "bigint"
    }
  ],
  "datarows": [
    [
      2147483647
    ]
  ],
  "total": 1,
  "size": 1
}
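
The effect of the column width on a large count can be sketched in Java. This is a minimal illustration, not the plugin's code: the class name, both helper methods, and the document count of 3,000,000,000 are hypothetical, and clamping with Math.min is an assumption about how the cap arises.

```java
public class CountWidthSketch {

    // Hypothetical helper: models a count reported through a 32-bit integer column.
    // Values above Integer.MAX_VALUE are clamped, which mirrors the capped result
    // shown in the first response (assumption: the real cap may arise differently).
    static int asIntegerCount(long actualCount) {
        return (int) Math.min(actualCount, (long) Integer.MAX_VALUE);
    }

    // Hypothetical helper: models the same count reported through a 64-bit long column.
    static long asLongCount(long actualCount) {
        return actualCount;
    }

    public static void main(String[] args) {
        long actualCount = 3_000_000_000L; // assumed number of matching documents
        System.out.println("integer column: " + asIntegerCount(actualCount)); // 2147483647
        System.out.println("long column:    " + asLongCount(actualCount));    // 3000000000
    }
}
```

With an integer column, any count above 2147483647 is indistinguishable from exactly 2147483647; a long column removes that ambiguity for any realistic document count.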

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
@aalva500-prog aalva500-prog marked this pull request as ready for review September 30, 2025 21:45
@aalva500-prog aalva500-prog changed the title [SQL/PPL] Fix the count(*) and dc(field) to be capped at MAX_INTEGER #4416 [SQL/PPL] Fix the count(*) and dc(field) to be capped at MAX_INTEGER #4416 Sep 30, 2025
@Swiddis Swiddis added bug Something isn't working v3.3.0 labels Sep 30, 2025
@RyanL1997 RyanL1997 added the PPL Piped processing language label Sep 30, 2025
@RyanL1997
Collaborator

https://github.com/opensearch-project/sql/actions/runs/18144320253/job/51642545887?pr=4418#step:6:575

Got

SQLBackwardsCompatibilityIT > testBackwardsCompatibility FAILED
    java.lang.AssertionError: 
    Expected: iterable with items [(name=COUNT(*) FILTER(WHERE age > 35), alias=null, type=integer)] in any order
         but: not matched: <{"name":"COUNT(*) FILTER(WHERE age > 35)","type":"long"}>
        at __randomizedtesting.SeedInfo.seed([27E773C284B404C7:CCD84E0CFE6FD14D]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)

        at org.opensearch.sql.util.MatcherUtils.verify(MatcherUtils.java:196)
        at org.opensearch.sql.util.MatcherUtils.verifySchema(MatcherUtils.java:144)
        at org.opensearch.sql.bwc.SQLBackwardsCompatibilityIT.verifySQLQueries(SQLBackwardsCompatibilityIT.java:190)
        at org.opensearch.sql.bwc.SQLBackwardsCompatibilityIT.testBackwardsCompatibility(SQLBackwardsCompatibilityIT.java:126)
  1> [2025-09-30T17:51:35,784][INFO ][o.o.s.b.SQLBackwardsCompatibilityIT] [testBackwardsCompatibility] before test
  1> [2025-09-30T17:51:35,946][INFO ][o.o.s.b.SQLBackwardsCompatibilityIT] [testBackwardsCompatibility] initializing REST clients against [http://[::1]:36159, 

Swiddis
Swiddis previously approved these changes Sep 30, 2025
Collaborator

@Swiddis Swiddis left a comment

lgtm, assuming tests pass

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
@LantaoJin LantaoJin added the engine v2 Issues related to v2 query engine only. label Oct 1, 2025
@LantaoJin
Member

LantaoJin commented Oct 1, 2025

Just a question: do we still need to fix v2-only bugs? The default engine migrates to v3 starting from 3.3.0, and there seems to be no plan to backport this fix to 3.1.

@RyanL1997
Collaborator

RyanL1997 commented Oct 1, 2025

Just a question: do we still need to fix v2-only bugs? The default engine migrates to v3 starting from 3.3.0, and there seems to be no plan to backport this fix to 3.1.

Hi @LantaoJin, I think this is related to a serverless issue. Serverless is still on a customized 2.17 version of sql, which still relies on the legacy engine. cc @aalva500-prog

@aalva500-prog
Contributor Author

aalva500-prog commented Oct 1, 2025

Hi @LantaoJin, @RyanL1997 is correct: this issue was reported from the serverless side. The latest OpenSearch version they use is a customized 2.17. I'm not sure when they plan to migrate to V3, but I don't think it will be in the near future.

@RyanL1997 RyanL1997 merged commit d7b2c35 into opensearch-project:main Oct 1, 2025
37 checks passed
asifabashar added a commit to asifabashar/sql that referenced this pull request Oct 10, 2025
* main-apple: (218 commits)
  Add ignorePrometheus Flag for integTest and docTest (opensearch-project#4442)
  Create fab-radar.yml
  PPL `fillnull` command enhancement (opensearch-project#4421)
  reverting to _doc + _id (opensearch-project#4435)
  Support `multisearch` command in calcite (opensearch-project#4332)
  Add 3.3 release notes (opensearch-project#4422) (opensearch-project#4423)
  [SQL/PPL] Fix the `count(*)` and `dc(field)` to be capped at MAX_INTEGER opensearch-project#4416 (opensearch-project#4418)
  Change the default search sort tiebreaker to `_shard_doc` for PIT search (opensearch-project#4378)
  [Enhancement] Add error handling for known limitation of sql `JOIN` (opensearch-project#4344)
  Bugfix: SQL type mapping for legacy JDBC output (opensearch-project#3613)
  Version bump: 3.3 (opensearch-project#4417)
  Add max/min eval functions (opensearch-project#4333)
  Support time modifiers in search command  (opensearch-project#4224)
  Fix numbered token bug and make it optional output in patterns command (opensearch-project#4402)
  refactor span (opensearch-project#4334)
  Move release notes categories (opensearch-project#3818)
  [Doc] Enable doctest with Calcite (opensearch-project#4379)
  Mod function should return decimal instead of float when handle the operands are decimal literal (opensearch-project#4407)
  Scale of decimal literal should always be positive in Calcite (opensearch-project#4401)
  Enable Calcite by default and implicit fallback the unsupported commands (opensearch-project#4372)
  ...
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 24, 2025
…GER #4416 (#4418)

Co-authored-by: Aaron Alvarez <aaarone@amazon.com>
(cherry picked from commit d7b2c35)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
LantaoJin added a commit that referenced this pull request Oct 28, 2025
…e capped at MAX_INTEGER #4416 (#4656)

* [SQL/PPL] Fix the `count(*)` and `dc(field)` to be capped at MAX_INTEGER #4416 (#4418)

Co-authored-by: Aaron Alvarez <aaarone@amazon.com>
(cherry picked from commit d7b2c35)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Fix IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

* Fix IT

Signed-off-by: Lantao Jin <ltjin@amazon.com>

---------

Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: Lantao Jin <ltjin@amazon.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aaron Alvarez <aaarone@amazon.com>
Co-authored-by: Lantao Jin <ltjin@amazon.com>
@aalva500-prog aalva500-prog deleted the feature/fixcount-clean branch January 7, 2026 22:57

Labels

backport 2.19-dev bug Something isn't working engine v2 Issues related to v2 query engine only. PPL Piped processing language v3.3.0

5 participants