-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
Describe the bug
I have a query where I JOIN three tables, and then have a WHERE clause on the last table ("twintag.owner"). If I give a value that exists there, all the results are returned (regardless of whether or not the "twintag.owner" matches the actual value), and if I give a value that doesn't exist, it returns nothing.
The query is as follows:
SELECT "75620d04a8b7b34c70c68f7af68953af".*
FROM "75620d04a8b7b34c70c68f7af68953af"
JOIN "f4be58656cd5e54b9bffa1706c3c49ff" AS "linkedtwintag"
ON "75620d04a8b7b34c70c68f7af68953af"."82ad7656c6b6be56e9abfac772c89b4c" = "linkedtwintag"."776ff32c7d7bfb1c1a49d5c0d94e0db8"
JOIN "__bag" AS "linkedtwintag_twintag"
ON "linkedtwintag"."dataScope" = "linkedtwintag_twintag"."storage_qid"
WHERE 1 = 1
AND "linkedtwintag_twintag"."owner" = '324ae4533db15dcb19e3473fea54177a'
ORDER BY "75620d04a8b7b34c70c68f7af68953af"."qid" ASCIf I start from the intermediate table it works as expected and I get back the one result I'm looking for:
SELECT "f4be58656cd5e54b9bffa1706c3c49ff".*
FROM "f4be58656cd5e54b9bffa1706c3c49ff"
JOIN "__bag" AS "twintag"
ON "f4be58656cd5e54b9bffa1706c3c49ff"."dataScope" = "twintag"."storage_qid"
WHERE 1 = 1
AND "twintag"."owner" = '324ae4533db15dcb19e3473fea54177a' ORDER BY "f4be58656cd5e54b9bffa1706c3c49ff"."qid" ASCTo Reproduce
Create some tables where you can construct a JOIN from table A to B and from B to C.
Execute the (type of) query, where you use the intermediate table to join from table A to C, filtering on something in table C:
SELECT "75620d04a8b7b34c70c68f7af68953af".*
FROM "75620d04a8b7b34c70c68f7af68953af"
JOIN "f4be58656cd5e54b9bffa1706c3c49ff" AS "linkedtwintag"
ON "75620d04a8b7b34c70c68f7af68953af"."82ad7656c6b6be56e9abfac772c89b4c" = "linkedtwintag"."776ff32c7d7bfb1c1a49d5c0d94e0db8"
JOIN "__bag" AS "linkedtwintag_twintag"
ON "linkedtwintag"."dataScope" = "linkedtwintag_twintag"."storage_qid"
WHERE 1 = 1
AND "linkedtwintag_twintag"."owner" = '324ae4533db15dcb19e3473fea54177a'
ORDER BY "75620d04a8b7b34c70c68f7af68953af"."qid" ASCExpected behavior
I expect only the items which match the filter on table C to be returned.
Currently it returns either ALL items if an item matches, or NO items if there is no match found. In the example above I expect 1 match, but I get all the items from table A back.
If I change the value I'm looking for in table C to be non-existent there, I get back no results instead, which is expected.
Additional context
I'm also using datafusion-remote-table from seafowl (https://github.com/splitgraph/seafowl/tree/main/datafusion_remote_tables)