-
Notifications
You must be signed in to change notification settings - Fork 8k
Description
Bug Description
Clickhouse Version: 21.3.4.25
JDBC Version: 0.3.1-patch
When executing queries with several ClickhouseExternalData, the server side has a possiblility that the parsed sql (store in system.query_log) is incorrect, which causes either syntax error or missing column exception.
e.g.
Querying about the avg, sum, min, max and any of column Ca with some extra conditions
L0, L1 ... and L9 are ClickhouseExternalData specified on the client side that pass along with the http uri
A correct query would be
select avgIf(Ca, Cb in L0) as avgCa1, sumIf(Ca, Cb in L0) as sumCa1, minIf(Ca, Cb in L0) as minCa1, maxIf(Ca, Cb in L0) as maxCa1, anyIf(Ca, Cb in L0) as anyCa1, avgIf(Ca, Cb in L1) as avgCa2, sumIf(Ca, Cb in L1) as sumCa2, minIf(Ca, Cb in L1) as minCa2, maxIf(Ca, Cb in L1) as maxCa2, anyIf(Ca, Cb in L1) as anyCa2, avgIf(Ca, Cb in L2) as avgCa3, sumIf(Ca, Cb in L2) as sumCa3, minIf(Ca, Cb in L2) as minCa3, maxIf(Ca, Cb in L2) as maxCa3, anyIf(Ca, Cb in L2) as anyCa3, avgIf(Ca, Cb in L3) as avgCa4, sumIf(Ca, Cb in L3) as sumCa4, minIf(Ca, Cb in L3) as minCa4, maxIf(Ca, Cb in L3) as maxCa4, anyIf(Ca, Cb in L3) as anyCa4, avgIf(Ca, Cb in L4) as avgCa5, sumIf(Ca, Cb in L4) as sumCa5, minIf(Ca, Cb in L4) as minCa5, maxIf(Ca, Cb in L4) as maxCa5, anyIf(Ca, Cb in L4) as anyCa5, avgIf(Ca, Cb in L5) as avgCa6, sumIf(Ca, Cb in L5) as sumCa6, minIf(Ca, Cb in L5) as minCa6, maxIf(Ca, Cb in L5) as maxCa6, anyIf(Ca, Cb in L5) as anyCa6, avgIf(Ca, Cb in L6) as avgCa7, sumIf(Ca, Cb in L6) as sumCa7, minIf(Ca, Cb in L6) as minCa7, maxIf(Ca, Cb in L6) as maxCa7, anyIf(Ca, Cb in L6) as anyCa7, avgIf(Ca, Cb in L7) as avgCa8, sumIf(Ca, Cb in L7) as sumCa8, minIf(Ca, Cb in L7) as minCa8, maxIf(Ca, Cb in L7) as maxCa8, anyIf(Ca, Cb in L7) as anyCa8, avgIf(Ca, Cb in L8) as avgCa9, sumIf(Ca, Cb in L8) as sumCa9, minIf(Ca, Cb in L8) as minCa9, maxIf(Ca, Cb in L8) as maxCa9, anyIf(Ca, Cb in L8) as anyCa9, avgIf(Ca, Cb in L9) as avgCa10, sumIf(Ca, Cb in L9) as sumCa10, minIf(Ca, Cb in L9) as minCa10, maxIf(Ca, Cb in L9) as maxCa10, anyIf(Ca, Cb in L9) as anyCa10 from test.test_table where TIMESTAMP >= 1625644740 and TIMESTAMP < 1625644800 and Cc = 'some value'
FORMAT TabSeparatedWithNamesAndTypes
It is supposed to have two lines , whereas the first line is a large query and the second line is added by jdbc.
However, in the query_log I saw queries like
select avgIf(Ca, Cb in L0) as avgCa1, sumIf(Ca, Cb in L0) as sumCa1, minIf(Ca, Cb in L0) as minCa1, maxIf(Ca, Cb in L0) as maxCa1, anyIf(Ca, Cb in L0) as anyCa1, avgIf(Ca, Cb in L1) as avgCa2, sumIf(Ca, Cb in L1) as sumCa2, minIf(Ca, Cb in L1) as minCa2, maxIf(Ca, Cb in L1) as maxCa2, anyIf(Ca, Cb in L1) as anyCa2, avgIf(Ca, Cb in L2) as avgCa3, sumIf(Ca, Cb in L2) as sumCa3, minIf(Ca, Cb in L2) as minCa3, maxIf(Ca, Cb in L2) as maxCa3, anyIf(Ca, Cb in L2) as anyCa3, avgIf(Ca, Cb in L3) as avgCa4, sumIf(Ca, Cb in L3) as sumCa4, minIf(Ca, Cb in L3) as minCa4, maxIf(Ca, Cb in L3) as maxCa4, anyIf(Ca, Cb in L3) as anyCa4, avgIf(Ca, Cb in L4) as avgCa5, sumIf(Ca, Cb in L4) as sumCa5, minIf(Ca, Cb in L4) as minCa5, maxIf(Ca, Cb in L4) as maxCa5, anyIf(Ca, Cb in L4) as anyCa5, avgIf(Ca, Cb in L5) as avgCa6, sumIf(Ca, Cb in L5) as sumCa6, minIf(Ca, Cb in L5) as minCa6, maxIf(Ca, Cb in L5) as maxCa6, anyIf(Ca, Cb in L5) as anyCa6, avgIf(Ca, Cb in L6) as avgCa7, sumIf(Ca, Cb in L6) as sumCa7, minIf(Ca, Cb in L6) as minCa7, maxIf(Ca, Cb in L6) as maxCa7, anyIf(Ca, Cb in L6) as anyCa7, avgIf(Ca, Cb in L7) as avgCa8, sumIf(Ca, Cb in L7) as sumCa8, minIf(Ca, Cb in L7) as minCa8, maxIf(Ca, Cb in L7) as maxCa8, anyIf(Ca, Cb in L7) as anyCa8, avgIf(Ca, Cb in L8) as avgCa9, sumIf(Ca, Cb in L8) as sumCa9, minIf(Ca, Cb in L8) as minCa9, maxIf(Ca, Cb in L8) as maxCa9, anyIf(Ca, Cb in L8) as anyCa9, avgIf(Ca, Cb in L9) as avgCa10, sumIf(Ca, Cb in L9) as sumCa10, minIf(Ca, Cb in L9) as minCa10, maxIf(Ca, Cb in L9) as maxCa10, anyIf(Ca, Cb in L9) as anyCa10 from test.test_table where TIMESTAMP >= 1625646900 and TIMESTAMP < 1625646960 and Cc = 'some value'
TIMESTAMP >= 1625646900 and TIMESTAMP < 1625646960 and Cc = 'some value'
FORMAT TabSeparatedWithNamesAndTypes
and
select avgIf(Ca, DN in L0) as avgCa1, sumIf(Ca, DN in L0) as sumCa1, minIf(Ca, DN in L0) as minCa1, maxIf(Ca, DN in L0) as maxCa1, anyIf(Ca, DN in L0) as anyCa1, avgIf(Ca, DN in L1) as avgCa2, sumIf(Ca, DN in L1) as sumCa2, minIf(Ca, DN in L1) as minCa2, maxIf(Ca, DN in L1) as maxCa2, anyIf(Ca, DN in L1) as anyCa2, avgIf(Ca, DN in L2) as avgCa3, sumIf(Ca, DN in L2) as sumCa3, minIf(Ca, DN in L2) as minCa3, maxIf(Ca, DN in L2) as maxCa3, anyIf(Ca, DN in L2) as anyCa3, avgIf(Ca, DN in L3) as avgCa4, sumIf(Ca, DN in L3) as sumCa4, minIf(Ca, DN in L3) as minCa4, maxIf(Ca, DN in L3) as maxCa4, anyIf(Ca, DN in L3) as anyCa4, avgIf(Ca, DN in L4) as avgCa5, sumIf(Ca, DN in L4) as sumCa5, minIf(Ca, DN in L4) as minCa5, maxIf(Ca, DN in L4) as maxCa5, anyIf(Ca, DN in L4) as anyCa5, avgIf(Ca, DN in L5) as avgCa6, sumIf(Ca, DN in L5) as sumCa6, minIf(Ca, DN in L5) as minCa6, maxIf(Ca, DN in L5) as maxCa6, anyIf(Ca, DN in L5) as anyCa6, avgIf(Ca, DN in L6) as avgCa7, sumIf(Ca, DN in L6) as sumCa7, minIf(Ca, DN in L6) as minCa7, maxIf(Ca, DN in L6) as maxCa7, anyIf(Ca, DN in L6) as anyCa7, avgIf(Ca, DN in L7) as avgCa8, sumIf(Ca, DN in L7) as sumCa8, minIf(Ca, DN in L7) as minCa8, maxIf(Ca, DN in L7) as maxCa8, anyIf(Ca, DN in L7) as anyCa8, avgIf(Ca, DN in L8) as avgCa9, sumIf(Ca, DN in L8) as sumCa9, minIf(Ca, DN in L8) as minCa9, maxIf(Ca, DN in L8) as maxCa9, anyIf(Ca, DN in L8) as anyCa9, avgIf(Ca, DN in L9) as avgCa10, sumIf(Ca, DN in L9) as sumCa10, minIf(Ca, DN in L9) as minCa10, maxIf(Ca, DN in L9) as maxCa10, anyIf(Ca, DN in L9) as anyCa10 from test.DT_NETWORKIF1 where TIMESTAMP >= 1625646840 and TIMESTAMP < 1625646900 and IFNAME = 'eth0'
a, Cb in L3) as avgCa4, sumIf(Ca, Cb in L3) as sumCa4, minIf(Ca, Cb in L3) as minCa4, maxIf(Ca, Cb in L3) as maxCa4, anyIf(Ca, Cb in L3) as anyCa4, avgIf(Ca, Cb in L4) as avgCa5, sumIf(Ca, Cb in L4) as sumCa5, minIf(Ca, Cb in L4) as minCa5, maxIf(Ca, Cb in L4) as maxCa5, anyIf(Ca, Cb in L4) as anyCa5, avgIf(Ca, Cb in L5) as avgCa6, sumIf(Ca, Cb in L5) as sumCa6, minIf(Ca, Cb in L5) as minCa6, maxIf(Ca, Cb in L5) as maxCa6, anyIf(Ca, Cb in L5) as anyCa6, avgIf(Ca, Cb in L6) as avgCa7, sumIf(Ca, Cb in L6) as sumCa7, minIf(Ca, Cb in L6) as minCa7, maxIf(Ca, Cb in L6) as maxCa7, anyIf(Ca, Cb in L6) as anyCa7, avgIf(Ca, Cb in L7) as avgCa8, sumIf(Ca, Cb in L7) as sumCa8, minIf(Ca, Cb in L7) as minCa8, maxIf(Ca, Cb in L7) as maxCa8, anyIf(Ca, Cb in L7) as anyCa8, avgIf(Ca, Cb in L8) as avgCa9, sumIf(Ca, Cb in L8) as sumCa9, minIf(Ca, Cb in L8) as minCa9, maxIf(Ca, Cb in L8) as maxCa9, anyIf(Ca, Cb in L8) as anyCa9, avgIf(Ca, Cb in L9) as avgCa10, sumIf(Ca, Cb in L9) as sumCa10, minIf(Ca, Cb in L9) as minCa10, maxIf(Ca, Cb in L9) as maxCa10, anyIf(Ca, Cb in L9) as anyCa10 from test.test_table where TIMESTAMP >= 1625646840 and TIMESTAMP < 1625646900 and Cc= 'some value'
FORMAT TabSeparatedWithNamesAndTypes
Some parts in the original query was duplicated in the second line and that directly caused many kinds of expcetions. The parsed query stored in system.query_log table differs almost every time I run it (in the case where exception is thrown)
BTW, it worked fine with Clickhouse version 20.8.12.2 and jdbc version 0.3.1-patch. I have checked the client side using debugger mode, and the parsedStmt and httpContext seemed ok to me.
How to reproduce
- creare a table with engine replicatedMergeTree, no need to insert data.
- On the java side, send a query to the server side with aggregate function and externaldata specified (like the example shown above),
- The larger the query size and number of externalData, the higher the possibility that the query throws exception.