Add invert_channels option to TransformData #1139
|
I think it is simpler to do a one time conversion of the VGG models / mean to BGR to keep a standard throughout instead of introducing another configuration switch. I can do the surgery today, or it may be simplest if I send you the script so that you can upload the transformed models yourself and update the model zoo entries. |
|
I agree with @shelhamer. Actually, it would be nice to have a (Python) script which does the conversion, so that it could be used for other models trained on RGB images. |
|
@ksimonyan I will share the script -- it is quite simple to do these conversions since parameters are mutable in the Python wrapper as the editing model parameters example shows. |
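For readers following along, here is a minimal NumPy sketch of the kind of conversion being discussed. It is not the script referred to above: it assumes the first convolution layer's weights are laid out as (num_output, channels, height, width), and the helper name `swap_input_channels` and the random stand-in arrays are made up for illustration. With the Python wrapper, the same slicing would presumably be applied to the first layer's mutable parameter array.

```python
import numpy as np

def swap_input_channels(weights):
    """Return a copy of conv weights with the input-channel axis reversed.

    weights: array of shape (num_output, channels, height, width).
    Reversing axis 1 maps RGB-ordered filters to BGR order (and back).
    """
    return weights[:, ::-1, :, :].copy()

# Stand-in for first-layer filters (hypothetical values, not a real model).
rng = np.random.RandomState(0)
rgb_weights = rng.randn(4, 3, 3, 3)
bgr_weights = swap_input_channels(rgb_weights)

# A 3x3 patch in RGB channel order and the same patch in BGR order.
rgb_patch = rng.randn(3, 3, 3)
bgr_patch = rgb_patch[::-1]

# Each filter's response is a full contraction over (channel, y, x).
rgb_response = np.tensordot(rgb_weights, rgb_patch, axes=3)
bgr_response = np.tensordot(bgr_weights, bgr_patch, axes=3)
```

The two responses agree, which is why only the first layer needs editing: every later layer sees the same activations as before. A mean image stored in RGB order would need the same channel reversal.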
|
Sure, I don't mind, I just wanted to give it a try quickly. With net surgery, @ksimonyan would need to update his models and change his Matlab code to switch the channels. @ksimonyan, with #501 it is also very easy to do net surgery in Matlab, just in case you want to do it yourself. So far I have tried two networks and they work well. These are the errors on the val set using only the center crop (differences with #1138 are probably due to the use of 10 crops per image and images resized to 256x256):
|
|
@sguada, thanks for trying out the models on ILSVRC. I've added the errors, achieved using a single test crop, to the gist: https://gist.github.com/ksimonyan/5c9129cfb8f0359eaf67
So you seem to be losing ~1% somewhere. I used the central crop from the images rescaled so that the smallest side is 256 (while preserving the aspect ratio). So maybe the aspect ratio distortion is the reason for the drop. |
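To make the two preprocessing variants concrete, here is a small sketch of the geometry only (no image library involved). The function names and the crop size of 224 are assumptions for illustration; the thread does not state the crop size.

```python
def resize_dims(width, height, short_side=256):
    """Scale (width, height) so the smallest side equals short_side,
    preserving the aspect ratio."""
    scale = short_side / float(min(width, height))
    return int(round(width * scale)), int(round(height * scale))

def center_crop_box(width, height, crop):
    """Return (left, top, right, bottom) of a centered crop x crop box."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop

# Aspect-preserving variant: a 500x375 image becomes 341x256.
new_w, new_h = resize_dims(500, 375)
preserved_box = center_crop_box(new_w, new_h, crop=224)

# Distorting variant: the image is squashed to 256x256 before cropping.
distorted_box = center_crop_box(256, 256, crop=224)
```

In the first variant the crop covers a narrower part of the original width without distorting it; in the second it covers more of the width, but the content is squeezed horizontally.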
|
@ksimonyan Thanks for sharing your errors with single test crop. |
|
@sguada |
|
I have a question about how you computed the mean image if images have different sizes and the crops can be in different positions. Or did you just compute the mean of the center crop? |
Yes. |
|
@ksimonyan regarding the mean, have you also tried training with a channel mean averaged over the spatial dimensions (since that simplifies preprocessing for differently-sized inputs)? |
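As a rough illustration of the channel-mean idea, the sketch below collapses a (channels, height, width) mean image to one value per channel and subtracts it from inputs of any size. The function names and the mean values are placeholders, not figures from any released model.

```python
import numpy as np

def channel_mean(mean_image):
    """Average a (channels, height, width) mean image over its spatial
    dimensions, giving one value per channel."""
    return mean_image.mean(axis=(1, 2))

def subtract_channel_mean(image, means):
    """Subtract per-channel means from a (channels, height, width) image
    of any spatial size, via broadcasting."""
    return image - means[:, np.newaxis, np.newaxis]

# Placeholder mean image with a constant value per channel.
mean_image = np.stack([np.full((256, 256), v) for v in (100.0, 110.0, 120.0)])
means = channel_mean(mean_image)

# The same three numbers apply to inputs of different sizes.
small = subtract_channel_mean(np.zeros((3, 224, 224)), means)
large = subtract_channel_mean(np.zeros((3, 300, 400)), means)
```

A spatial mean image, by contrast, has to be cropped or resized to match each input, which is where the question about differently sized images and crop positions comes from.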
|
@shelhamer |
|
Not yet, I did it in a private branch, but I could do a PR and share it.
Please do, it might come in handy. |
|
Sure I will, are you planning on releasing your models for ILSVRC-2014?
Possibly, but let's sort out the BMVC-2014 models first. |
|
@sguada here's a script to swap channels: https://gist.github.com/shelhamer/bee2a5b2b739fe6cee6f#file-swap_input_channels-py
PR your channel-wise mean transformer when you have a chance -- like we've talked about, the age of the channel mean will be a better one than these spatial mean times. |
|
Thanks to @ksimonyan, the VGG models in #1138 are now in BGR format, so there is no need for this PR.
Added invert_channels option to TransformData. This inverts the order of the channels in the datum and the mean, so that the models released in #1138 can be used with current leveldb/lmdb databases and an image_mean stored in BGR format.
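The intended effect can be sketched in NumPy, outside of Caffe. This is not the code in the PR; it assumes the datum and the mean are (channels, height, width) arrays stored in BGR order, and the helper name and random data are illustrative only.

```python
import numpy as np

def invert_channels(array):
    """Reverse the channel axis of a (channels, height, width) array,
    turning BGR into RGB (or RGB into BGR)."""
    return array[::-1].copy()

# Stand-ins for a stored BGR datum and BGR mean (random, illustrative).
rng = np.random.RandomState(0)
bgr_image = rng.randint(0, 256, size=(3, 4, 4)).astype(np.float32)
bgr_mean = rng.uniform(0, 255, size=(3, 4, 4)).astype(np.float32)

# Invert both, then subtract: this is the input an RGB-trained net expects.
rgb_input = invert_channels(bgr_image) - invert_channels(bgr_mean)

# Equivalent: subtract in BGR order first, then invert once.
rgb_input_alt = invert_channels(bgr_image - bgr_mean)
```

Both orderings give the same result, so what matters is that the datum and the mean are inverted together.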
This is a simple fix, what do you think @ksimonyan @shelhamer?
Ex.