Since tags are now stored in the data instead of in the metadata Delta Lake, both ingestion and querying can be optimized to take advantage of this new structure. Querying in particular can be optimized heavily: we currently reconstruct the tag columns for every field column but only use the tag columns from the first field column. We also reconstruct all tag columns, not just the requested ones.
We should look into ways of optimizing this process so that we take advantage of the tags being available in the compressed segments and reconstruct the tag columns efficiently. One option is to create a more detailed plan of what is needed from each grid operator instead of always reconstructing as much as possible.
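
As a rough illustration of such a plan, the sketch below decides per field column which tag columns its grid operator would reconstruct. It is only a sketch under assumptions: `GridPlan`, `plan_grids`, and the column names are hypothetical and do not correspond to existing code, and the fallback for tag-only queries assumes that one field column's compressed segments still have to be read to reach the tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridPlan:
    """What a single (hypothetical) grid operator is asked to reconstruct."""

    field_column: str
    tag_columns: tuple[str, ...]


def plan_grids(requested_columns, field_columns, tag_columns):
    """Build one plan per requested field column.

    Only the first plan carries tag columns, and only the requested ones,
    so no tag column is reconstructed more than once or without being used.
    """
    requested = set(requested_columns)
    fields = [column for column in field_columns if column in requested]
    tags = tuple(column for column in tag_columns if column in requested)

    # Assumption: a tag-only query still needs the segments of one field
    # column to reach the tags, so fall back to the first field column.
    if not fields and tags and field_columns:
        fields = [field_columns[0]]

    return [
        GridPlan(field, tags if index == 0 else ())
        for index, field in enumerate(fields)
    ]
```

For a query that requests `temperature`, `humidity`, and `location` from a table with an additional `pressure` field and `sensor_id` tag, this would plan two grid operators: one for `temperature` that also reconstructs `location`, and one for `humidity` that reconstructs no tags at all.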