remove druid.processing.columnCache.sizeBytes and CachingIndexed, combine string column implementations #14500
changes:
* generic indexed, front-coded, and auto string columns now all share the same column and index supplier implementations
* remove CachingIndexed implementation, which I think is largely no longer needed by the switch of many things to directly using ByteBuffer, avoiding the cost of creating Strings
* remove ColumnConfig.columnCacheSizeBytes since CachingIndexed was the only user
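As a rough illustration of why a String cache matters less once lookups work on raw bytes, here is a minimal sketch of a sorted-dictionary lookup that compares UTF-8 bytes in ByteBuffer form and never decodes a String. This is not Druid's implementation: the class Utf8CompareSketch and the methods utf8, compareUtf8, and binarySearch are hypothetical names made up for this example, and the in-memory ByteBuffer[] stands in for whatever dictionary layout the real column uses.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names come from Druid.
public class Utf8CompareSketch {

  // Encodes a String once, e.g. when a query literal is first seen.
  static ByteBuffer utf8(String s) {
    return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
  }

  // Compares the remaining bytes of two buffers as unsigned bytes.
  // Uses absolute gets, so neither buffer's position is modified.
  static int compareUtf8(ByteBuffer a, ByteBuffer b) {
    final int aPos = a.position();
    final int bPos = b.position();
    final int len = Math.min(a.remaining(), b.remaining());
    for (int i = 0; i < len; i++) {
      final int cmp = Integer.compare(a.get(aPos + i) & 0xFF, b.get(bPos + i) & 0xFF);
      if (cmp != 0) {
        return cmp;
      }
    }
    // All shared bytes are equal, so the shorter value sorts first.
    return Integer.compare(a.remaining(), b.remaining());
  }

  // Binary search over a dictionary sorted by compareUtf8.
  // Returns the index of the key, or -(insertionPoint + 1) if absent.
  static int binarySearch(ByteBuffer[] sortedDictionary, ByteBuffer key) {
    int lo = 0;
    int hi = sortedDictionary.length - 1;
    while (lo <= hi) {
      final int mid = (lo + hi) >>> 1;
      final int cmp = compareUtf8(sortedDictionary[mid], key);
      if (cmp < 0) {
        lo = mid + 1;
      } else if (cmp > 0) {
        hi = mid - 1;
      } else {
        return mid;
      }
    }
    return -(lo + 1);
  }

  public static void main(String[] args) {
    final ByteBuffer[] dictionary = {utf8("apple"), utf8("banana"), utf8("cherry")};
    System.out.println(binarySearch(dictionary, utf8("banana")));    // prints 1
    System.out.println(binarySearch(dictionary, utf8("blueberry"))); // prints -3
  }
}
```

In a sketch like this, the only String-to-bytes conversion happens once for the search key, so there are no decoded dictionary values left for a cache to hold.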
IntelliJ inspection failure: is incorrect, but maybe it doesn't recognize the anonymous lambda classes of
gianm
left a comment
LGTM.
I am ok with removing druid.processing.columnCache.sizeBytes. I searched a few places for that property, both public and private, and did not see evidence that it is widely used. I also agree that with the various efforts to do more ops directly on UTF-8, it isn't as useful as it used to be.
Btw, this reminded me of #11201, a PR of mine that is a couple years old and is still open. I just merged master into it, and would appreciate a review of that PR. It's some work towards HLL sketches working directly on UTF-8.
    this.columnConfig = columnConfig;
    this.numRows = numRows;
    if (frontCodedStringDictionarySupplier != null) {
      this.stringIndexSupplier = new StringUtf8ColumnIndexSupplier<>(
Might be clearer if the conditional is only about declaring the dictionary.
cleaned up both this and the deserializer in DictionaryEncodedColumnPartSerde a bit more.
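For readers following the review thread, here is a minimal sketch of the shape the suggestion points at: the conditional only picks the dictionary, and the index supplier is then constructed once. Everything in it is an assumption made for illustration. IndexSupplier is a hypothetical stand-in rather than Druid's StringUtf8ColumnIndexSupplier, whose constructor arguments are not shown in the snippet above, and the two supplier parameters are placeholder names.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; these types are stand-ins, not Druid classes.
public class DictionarySelectionSketch {

  // Stand-in for an index supplier that only needs a dictionary.
  static final class IndexSupplier {
    final Supplier<List<String>> dictionary;

    IndexSupplier(Supplier<List<String>> dictionary) {
      this.dictionary = dictionary;
    }
  }

  static IndexSupplier build(
      Supplier<List<String>> frontCodedDictionary, // may be null
      Supplier<List<String>> genericDictionary
  ) {
    // The conditional decides which dictionary to use and nothing else.
    final Supplier<List<String>> dictionary;
    if (frontCodedDictionary != null) {
      dictionary = frontCodedDictionary;
    } else {
      dictionary = genericDictionary;
    }
    // Construction happens in exactly one place, after the choice is made.
    return new IndexSupplier(dictionary);
  }

  public static void main(String[] args) {
    final IndexSupplier supplier = build(null, () -> List.of("a", "b"));
    System.out.println(supplier.dictionary.get()); // prints [a, b]
  }
}
```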
This PR has bad luck with static checks being completely wrong, here is another one: which is saying that should instead be: which .. no.
The failing CI check seems to be failing on a few (maybe all?) recent PRs and is unrelated to the changes here.
Description
Follow up to #14142 to clean up some additional stuff.
changes:
* generic indexed, front-coded, and auto string columns now all share the same column and index supplier implementations
* remove CachingIndexed implementation, which I think is largely no longer needed by the switch of many things to directly using ByteBuffer, avoiding the cost of creating String
* remove ColumnConfig.columnCacheSizeBytes() since CachingIndexed was the only user
Release note
druid.processing.columnCache.sizeBytes has been removed since it provided limited utility after a number of internal changes. Leaving this config in place is harmless, but it does nothing.