We had a lookup like this:
{
  "namespace" : "channel_lookup",
  "pollPeriod" : "P1D",
  "namespaceParseSpec" : {
    "keyFieldName" : "key",
    "valueFieldName" : "value",
    "format" : "customJson"
  },
  "uri" : "file:/opt/data/druid/channel-lookup.json",
  "type" : "uri"
}
In this situation I'd have expected it to match exactly the file /opt/data/druid/channel-lookup.json, but it actually matched a different file named "init" in the same directory. Reading through the code, it looks like the file name of a uri is intentionally ignored in favor of searching the parent directory for files matching the versionRegex.
IMO this is pretty surprising behavior (even though the docs do suggest that this can happen).
Could we improve it in one (or all) of these ways?
- don't do the regex-matching-in-parent-dir thing when a specific file is provided and versionRegex = null
- when doing the regex-matching-in-parent-dir thing, only match files if they match the versionRegex and have the provided uri as a prefix (instead of only matching the versionRegex)
- include an example in the docs of how to use a specific file for a lookup
- describe the regex-matching-in-dir process front and center under "URI namespace update" rather than only under the section for versionRegex.
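To make the first two suggestions concrete, here is a rough Python sketch of the selection rule I have in mind. It is not Druid's actual code: the function name `candidate_files`, its parameters, and the choice to require the regex to match the whole file name are all assumptions for illustration, and "uri as a prefix" is simplified here to "file name of the uri as a prefix".

```python
import re
from pathlib import PurePosixPath
from typing import Iterable, List, Optional
from urllib.parse import urlparse


def candidate_files(
    uri: str,
    dir_listing: Iterable[str],
    version_regex: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: pick which file names in the uri's parent
    directory a lookup would be allowed to load."""
    # File name component of the configured uri, e.g. "channel-lookup.json".
    target = PurePosixPath(urlparse(uri).path).name

    if version_regex is None:
        # Suggestion 1: no versionRegex means "use exactly this file".
        return [name for name in dir_listing if name == target]

    pattern = re.compile(version_regex)
    # Suggestion 2: a file must match the versionRegex AND start with the
    # configured file name. Full-match semantics are an assumption here.
    return [
        name
        for name in dir_listing
        if name.startswith(target) and pattern.fullmatch(name)
    ]
```

With a directory containing `init`, `channel-lookup.json`, and `channel-lookup.json.2`, a null versionRegex would select only `channel-lookup.json`, and a versionRegex such as `.*\.\d+` would select only `channel-lookup.json.2`. In neither case would `init` be picked up.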