Add invert_channels option to TransformData #1139

Closed: sguada wants to merge 1 commit into BVLC:dev from sguada:invert_channels

Conversation

@sguada (Contributor) commented Sep 22, 2014

This adds an invert_channels option to TransformData. It inverts the order of the channels in the datum and the mean, so that the models released in #1138 can be used with the current level_db/lmdb and image mean stored in BGR format.
This is a simple fix; what do you think, @ksimonyan @shelhamer?

Example:

name: "VGG_CNN_F"
layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "examples/imagenet/ilsvrc12_val_leveldb"
    batch_size: 50
  }
  transform_param {
    crop_size: 224
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
    mirror: false
    invert_channels: true
  }
  include: { phase: TEST }
}
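
(For illustration only, not part of the PR code: a minimal numpy sketch of what inverting the channels amounts to for a C x H x W datum and its mean; the array names here are hypothetical.)

import numpy as np

# Hypothetical example: reversing the channel axis turns RGB data into BGR
# (and vice versa) for both the image and the mean, leaving weights untouched.
image = np.random.rand(3, 224, 224)   # C x H x W, assume RGB order
mean = np.random.rand(3, 224, 224)

image_bgr = image[::-1]               # reverse along the channel axis
mean_bgr = mean[::-1]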

@shelhamer (Member)

I think it is simpler to do a one time conversion of the VGG models / mean to BGR to keep a standard throughout instead of introducing another configuration switch. I can do the surgery today, or it may be simplest if I send you the script so that you can upload the transformed models yourself and update the model zoo entries.

@ksimonyan (Contributor)

I agree with @shelhamer. Actually, it would be nice to have a (Python) script which does the conversion, so that it could be used for other models trained on RGB images.

@shelhamer (Member)

@ksimonyan I will share the script -- it is quite simple to do these conversions since parameters are mutable in the Python wrapper, as the editing model parameters example shows.
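
(A minimal sketch of that surgery, assuming the 2014-era pycaffe API, a first convolution layer named 'conv1', and hypothetical file names; the actual script is linked later in this thread.)

import caffe

# Load the released RGB model (file names are placeholders).
net = caffe.Net('VGG_CNN_F_deploy.prototxt', 'VGG_CNN_F.caffemodel')

# Reverse the input-channel axis of the first conv layer's filters so the
# network expects BGR input instead of RGB.
conv1 = net.params['conv1'][0].data
conv1[...] = conv1[:, ::-1, :, :]

# Assumes Net.save is available in the Python wrapper.
net.save('VGG_CNN_F_bgr.caffemodel')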

@sguada (Contributor, Author) commented Sep 22, 2014

Sure, I don't mind; I just wanted to give it a try quickly. With net surgery, @ksimonyan would need to update his models and change his Matlab code to switch the channels.

@ksimonyan, with #501 it is also very easy to do net surgery in Matlab, in case you want to do it yourself.

So far I have tried two networks and they work well. These are the errors on the val set using only the center crop (differences from #1138 are probably due to the use of 10 crops per image and to images resized to 256x256):

  • VGG_CNN_F: top-1 (42.92%) top-5 (20.4%)
  • VGG_CNN_S: top-1 (38.14%) top-5 (16.73%)

@ksimonyan (Contributor)

@sguada, thanks for trying out the models on ILSVRC.

I've added the errors, achieved using a single test crop, to the gist: https://gist.github.com/ksimonyan/5c9129cfb8f0359eaf67
They are as follows:

  • CNN_S: 15.4%
  • CNN_M: 15.7%
  • CNN_M_2048: 15.6%
  • CNN_M_1024: 16.0%
  • CNN_M_128: 18.3%
  • CNN_F: 18.9%

So you seem to be losing ~1% somewhere. I used the central crop from the images rescaled so that the smallest side is 256 (while preserving the aspect ratio). So maybe the aspect ratio distortion is the reason for the drop.
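
(A sketch of that preprocessing, as an assumption about the procedure rather than the released evaluation code: rescale so the smallest side is 256 while keeping the aspect ratio, then take the central 224x224 crop.)

import numpy as np
from PIL import Image

def center_crop(path, small_side=256, crop=224):
    # Resize so the smallest side equals small_side, preserving aspect ratio.
    im = Image.open(path)
    w, h = im.size
    scale = float(small_side) / min(w, h)
    im = im.resize((int(round(w * scale)), int(round(h * scale))), Image.BILINEAR)
    # Take the central crop x crop window.
    w, h = im.size
    left, top = (w - crop) // 2, (h - crop) // 2
    return np.asarray(im.crop((left, top, left + crop, top + crop)))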

@sguada (Contributor, Author) commented Sep 23, 2014

@ksimonyan Thanks for sharing your errors with a single test crop.
As you said, since the model was trained on images that preserve the aspect ratio, testing on images with a distorted aspect ratio (resized to 256x256) performs ~1.5% worse. I will try testing with images that maintain the aspect ratio.

@ksimonyan (Contributor)

@sguada ideally you'll also need a new mean image for that, or you can convert the one I released (which is stored in the mat file as a 224x224 RGB image).
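
(A sketch of that conversion, assuming the mean is stored in the .mat file as an H x W x 3 RGB array; the variable and file names here are guesses, not the released format.)

import numpy as np
import scipy.io
import caffe

mat = scipy.io.loadmat('VGG_mean.mat')                  # hypothetical file name
mean_rgb = mat['image_mean']                            # assumed H x W x 3, RGB
mean_bgr = mean_rgb[:, :, ::-1].transpose((2, 0, 1))    # to 3 x H x W, BGR

# Write it out in the binaryproto format expected by mean_file.
blob = caffe.io.array_to_blobproto(mean_bgr[np.newaxis])
with open('VGG_mean_bgr.binaryproto', 'wb') as f:
    f.write(blob.SerializeToString())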

@sguada (Contributor, Author) commented Sep 23, 2014

I have a question about how you computed the mean image, given that images have different sizes and the crops can be at different positions. Or did you just compute the mean of the center crop?

@ksimonyan (Contributor)

did you just compute the mean of the center crop?

Yes.

@shelhamer (Member)

@ksimonyan regarding the mean, have you also tried training with a channel mean averaged over the spatial dimensions (since that simplifies preprocessing for differently-sized inputs)?
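
(What that channel mean looks like in practice: a couple of numpy lines, assuming the spatial mean is available as a 3 x H x W array; the file name is hypothetical.)

import numpy as np

mean_image = np.load('mean.npy')              # assumed 3 x H x W spatial mean
channel_mean = mean_image.mean(axis=(1, 2))   # one value per channel, shape (3,)
# channel_mean can be subtracted from inputs of any spatial size.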

@ksimonyan (Contributor)

@shelhamer I did it for the ILSVRC-2014 submission (see http://arxiv.org/abs/1409.1556/), but not for the BMVC models. Is it supported by caffe?

@sguada (Contributor, Author) commented Sep 23, 2014

Not yet, I did it in a private branch, but I could do a PR and share it.

@ksimonyan (Contributor)

Not yet, I did it in a private branch, but I could do a PR and share it.

Please do, it might come in handy.

@sguada (Contributor, Author) commented Sep 23, 2014

Sure I will. Are you planning on releasing your models for ILSVRC-2014?

@ksimonyan (Contributor)

are you planning on releasing your models for ILSVRC-2014?

Possibly, but let's sort out the BMVC-2014 models first.

@shelhamer (Member)

@sguada here's a script to swap channels: https://gist.github.com/shelhamer/bee2a5b2b739fe6cee6f#file-swap_input_channels-py

PR your channel-wise mean transformer when you have a chance -- as we've talked about, the channel mean will be a better choice than these spatial means.

@sguada (Contributor, Author) commented Sep 23, 2014

Thanks to @ksimonyan, the VGG models in #1138 are now in BGR format, so there is no need for this PR.

sguada closed this Sep 23, 2014