Even when lookups share the same source (a JDBC table or a CSV file), we currently have to write a separate configuration for each (key, value) column pair.
For example, if a lookup DB table has four columns, A, B, C, and D, and I need three lookups (A to B, C to D, and A to C), I have to define three different namespaces, which repeat a lot of information such as the URI, poll period, and ID.
I think this approach is hard to maintain.
When the source configuration changes, for example a password change or a table/column rename, it is hard to tell which namespaces are affected, and tedious to update all of them by hand.
So I think it would be better to split the namespace into two levels: namespaces and lookup maps.
A namespace would describe the data source, and a lookup map would be defined for each (key, value) column or field pair within that data source.
For the example above, the new configuration could look like this:
{
"type":"jdbc",
"namespace":"DB1",
"connectorConfig":{
"createTables":true,
"connectURI":"jdbc:mysql://localhost:3306/druid",
"user":"druid",
"password":"diurd"
},
"table":"some_lookup_table",
"lookup maps": [
{"name": "AtoB",
"key": "A",
"value":"B"},
{"name": "CtoD",
"key": "C",
"value":"D"},
{"name": "AtoC",
"key": "A",
"value":"C"}
],
"tsColumn":"timestamp_column",
"pollPeriod":600000
}
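To show how the two-level form relates to the current one-namespace-per-pair setup, here is a rough Python sketch that expands the proposed config into flat per-pair configs. The function name `expand_lookup_maps`, the `namespace.mapName` naming scheme, and the `keyColumn`/`valueColumn` field names are assumptions made for illustration, not an existing Druid API.

```python
import copy

# The proposed two-level config from above, as a Python dict.
proposed = {
    "type": "jdbc",
    "namespace": "DB1",
    "connectorConfig": {
        "createTables": True,
        "connectURI": "jdbc:mysql://localhost:3306/druid",
        "user": "druid",
        "password": "diurd",
    },
    "table": "some_lookup_table",
    "lookup maps": [
        {"name": "AtoB", "key": "A", "value": "B"},
        {"name": "CtoD", "key": "C", "value": "D"},
        {"name": "AtoC", "key": "A", "value": "C"},
    ],
    "tsColumn": "timestamp_column",
    "pollPeriod": 600000,
}


def expand_lookup_maps(two_level):
    """Expand one two-level namespace config into flat single-pair configs."""
    # Everything except the lookup map list is shared source-level config.
    shared = {k: v for k, v in two_level.items() if k != "lookup maps"}
    flat = []
    for lookup_map in two_level["lookup maps"]:
        cfg = copy.deepcopy(shared)
        # Hypothetical naming scheme: "<namespace>.<lookup map name>".
        cfg["namespace"] = two_level["namespace"] + "." + lookup_map["name"]
        # Assumed flat field names for the key and value columns.
        cfg["keyColumn"] = lookup_map["key"]
        cfg["valueColumn"] = lookup_map["value"]
        flat.append(cfg)
    return flat


flat_configs = expand_lookup_maps(proposed)
```

Each of the three resulting configs carries its own copy of the connection settings, table, and poll period, which is the redundancy the two-level form keeps in one place.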
Also, NamespacedExtractor could take one more parameter that indicates the lookup map name within the given namespace.
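As a minimal sketch of that extra parameter, the Python below builds every lookup map of a namespace from a single pass over the source rows and then resolves a key by (namespace, lookup map name). The class `NamespaceLookups`, its methods, and the sample rows are all hypothetical and only illustrate the idea; they do not reflect the actual NamespacedExtractor code.

```python
class NamespaceLookups:
    """Holds, per namespace, one dict per named lookup map."""

    def __init__(self):
        self._namespaces = {}

    def load(self, namespace, rows, lookup_maps):
        # One read of the source rows populates all lookup maps at once.
        maps = {m["name"]: {} for m in lookup_maps}
        for row in rows:
            for m in lookup_maps:
                maps[m["name"]][row[m["key"]]] = row[m["value"]]
        self._namespaces[namespace] = maps

    def apply(self, namespace, map_name, key):
        # The extra parameter: which lookup map to use inside the namespace.
        return self._namespaces[namespace][map_name].get(key)


# Made-up rows standing in for some_lookup_table (columns A, B, C, D).
rows = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a2", "B": "b2", "C": "c2", "D": "d2"},
]
lookup_maps = [
    {"name": "AtoB", "key": "A", "value": "B"},
    {"name": "CtoD", "key": "C", "value": "D"},
    {"name": "AtoC", "key": "A", "value": "C"},
]

lookups = NamespaceLookups()
lookups.load("DB1", rows, lookup_maps)
result = lookups.apply("DB1", "AtoC", "a2")
```

With this layout the source is polled once per namespace rather than once per (key, value) pair.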