[SPARK-27256][CORE][SQL] If the configuration is used to set the number of bytes, we'd better use bytesConf. #24187
10110346 wants to merge 1 commit into apache:master
Test build #103839 has finished for PR 24187 at commit
Do we need to check whether the input value falls within the integer range?
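As a rough illustration of the kind of check being asked about: once a setting is parsed as a byte count (a long) but later consumed as an int, the value can be validated before narrowing. This is only a sketch; `IntRangeCheckSketch` and `requireIntRange` are hypothetical names, and the lower bound of 1 byte is an assumption for the example, not something stated in this thread.

```java
public class IntRangeCheckSketch {
    /**
     * Returns the byte count as an int, or fails with a descriptive error
     * when it does not fit. The "greater than 0" bound is an assumption.
     */
    public static int requireIntRange(String key, long bytes) {
        if (bytes <= 0 || bytes > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                key + " must be greater than 0 and at most "
                    + Integer.MAX_VALUE + " bytes, but was " + bytes);
        }
        // Safe to narrow: the range was checked above.
        return (int) bytes;
    }
}
```

Failing at validation time gives the user a message that names the setting, instead of a silent overflow when the long is narrowed later.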
I'm not sure this is a net helpful change, as the parameter is maxPartitionBytes. I agree it would have been better to call it maxPartitionSize and accept values like "10m". I'm not strongly against it, as existing values would still work.
For other property values without "Bytes", I agree.
Thanks.
Yeah, the parameter name is a bit confusing, but I don't think it matters much whether the name contains "Bytes" or not. I'd prefer to change it.
I like this change since both styles (i.e. 1024 and 1k) can be accepted.
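To illustrate what accepting both styles means, here is a minimal sketch of a size-string parser. It is not Spark's actual implementation: `ByteSizeSketch` and `parseBytes` are hypothetical names, and the supported suffixes and the binary multiples (1k = 1024 bytes, which matches 256M = 268435456 in the PR description) are assumptions for the example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ByteSizeSketch {
    // One or more digits, then an optional unit suffix such as "k" or "mb".
    private static final Pattern SIZE = Pattern.compile("([0-9]+)\\s*([a-z]*)");

    /** Parses strings such as "1024", "1k" or "256M" into a byte count. */
    public static long parseBytes(String text) {
        Matcher m = SIZE.matcher(text.trim().toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            throw new NumberFormatException("Invalid size: " + text);
        }
        long value = Long.parseLong(m.group(1));
        long multiplier;
        switch (m.group(2)) {
            case "": case "b":   multiplier = 1L;       break; // plain bytes
            case "k": case "kb": multiplier = 1L << 10; break;
            case "m": case "mb": multiplier = 1L << 20; break;
            case "g": case "gb": multiplier = 1L << 30; break;
            default:
                throw new NumberFormatException("Unknown unit in: " + text);
        }
        // Throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(value, multiplier);
    }
}
```

With a parser of this shape, an existing value such as `1024` keeps its meaning, which is why old configurations would continue to work.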
Test build #103881 has finished for PR 24187 at commit
retest this please
Could we apply the same fix below, too? Also, have you checked all the related places for the same fix?
Test build #103894 has finished for PR 24187 at commit
Thanks.
retest this please
OK. To make this easier for other reviewers to understand, could you list all the configs that this PR changes in the PR description?
OK, thanks.
What if we don't have this long cast?
If we don't convert to long first, we hit an exception like this:
Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
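The failure mode can be reproduced outside Spark with plain boxed values. The sketch below is an illustration only: `readByteConf` is a hypothetical stand-in for a config lookup that now yields a boxed `Long`, and 4096 is an arbitrary value. It shows that casting the boxed value straight to `Integer` fails, while unboxing as `Long` first and then narrowing works.

```java
public class LongCastSketch {
    // Hypothetical stand-in for a lookup whose value is now a boxed Long.
    static Object readByteConf() {
        return Long.valueOf(4096L);
    }

    /** Unbox as Long first, then narrow; fails loudly if the value is too large. */
    public static int viaLong() {
        Object raw = readByteConf();
        long asLong = (Long) raw;
        return Math.toIntExact(asLong);
    }

    /** Returns true when the direct cast to Integer is rejected at runtime. */
    public static boolean directCastFails() {
        Object raw = readByteConf();
        try {
            Integer bad = (Integer) raw; // a boxed Long is not an Integer
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The cast is rejected because `Long` and `Integer` are unrelated classes; the numeric conversion has to happen on the primitive value after unboxing.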
maropu left a comment
LGTM, and I'll leave it to other reviewers. cc: @cloud-fan @srowen @dongjoon-hyun
Test build #103900 has finished for PR 24187 at commit
Test build #103909 has finished for PR 24187 at commit
thanks, merging to master!
What changes were proposed in this pull request?
Currently, if we want to configure spark.sql.files.maxPartitionBytes to 256 megabytes, we must set spark.sql.files.maxPartitionBytes=268435456, which is very unfriendly to users.
And if we set it like this: spark.sql.files.maxPartitionBytes=256M, we will encounter an exception.
This PR uses bytesConf to replace longConf or intConf if the configuration is used to set the number of bytes.
Configuration change list:
spark.files.maxPartitionBytes
spark.files.openCostInBytes
spark.shuffle.sort.initialBufferSize
spark.shuffle.spill.initialMemoryThreshold
spark.sql.autoBroadcastJoinThreshold
spark.sql.files.maxPartitionBytes
spark.sql.files.openCostInBytes
spark.sql.defaultSizeInBytes

How was this patch tested?
1. Existing unit tests
2. Manual testing
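The 256 megabyte example from the description can be checked with a small sketch. It assumes that a long-typed setting boils down to plain long parsing, which is a simplification; the exact exception Spark surfaces is not shown in this thread. `LongConfSketch` and its methods are hypothetical names.

```java
public class LongConfSketch {
    /** True when the raw string is acceptable to plain long parsing. */
    public static boolean parsesAsPlainLong(String raw) {
        try {
            Long.parseLong(raw);
            return true;
        } catch (NumberFormatException e) {
            return false; // e.g. "256M" carries a unit suffix
        }
    }

    /** Converts megabytes to bytes using binary multiples (1 MB = 1024 * 1024 bytes). */
    public static long megabytesToBytes(long megabytes) {
        return Math.multiplyExact(megabytes, 1024L * 1024L);
    }
}
```

Here `megabytesToBytes(256)` yields 268435456, the value users previously had to type by hand, while `"256M"` is rejected by plain long parsing.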