Feature/ignore euclideanloss #5250

Open
matthill wants to merge 3 commits into BVLC:master from matthill:feature/ignore-euclideanloss

Conversation

@matthill commented Feb 3, 2017

This PR allows me to use an "ignore_label" value in the EuclideanLoss layer. For example:

layer {
  name: "loss_match"
  type: "EuclideanLoss"
  bottom: "convb"
  bottom: "regression_label"
  top: "loss_match"
  loss_param {
    ignore_label: 0
  }
}

The issue I was having is that I am training against multiple loss functions with different data, and part of my dataset is missing. I still want to use the data when I have it, but I do not want to back-propagate when those values are missing. Simply setting the missing values to zero causes the regression to lean towards zero rather than learning to ignore those samples.

SoftmaxWithLoss already has this feature; this PR adds parity for the EuclideanLoss layer.

99% of this PR is pulled from this PR: #3677

However, that PR was not ideal, since it simply casts the value to an int and ignores it if it matches the ignore_label. My values are normalized between -1 and 1, so that wouldn't work. Instead, I check whether the float value falls within a narrow range around the configured integer. After making this change, my network's accuracy was much higher.
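
For illustration, here is a minimal standalone C++ sketch of that epsilon-band check. It is not the actual patch: the function name, the band width kIgnoreEps, and the per-element normalization are assumptions made for the example.

#include <cmath>
#include <vector>

// Hypothetical helper: accumulate the squared error, skipping any target that
// falls within a narrow band around the configured ignore_label.
// kIgnoreEps is an assumed band width, not the value used in the PR.
double EuclideanLossIgnoring(const std::vector<float>& pred,
                             const std::vector<float>& target,
                             float ignore_label) {
  const float kIgnoreEps = 1e-4f;  // the "narrow range" around the integer label
  double loss = 0.0;
  int kept = 0;
  for (size_t i = 0; i < pred.size(); ++i) {
    if (std::fabs(target[i] - ignore_label) < kIgnoreEps) {
      continue;  // treat this target as missing: no loss, no gradient
    }
    const double diff = pred[i] - target[i];
    loss += diff * diff;
    ++kept;
  }
  // Normalizing over the kept elements only; how to normalize when entries
  // are ignored is itself part of the later discussion in this thread.
  return kept > 0 ? loss / (2.0 * kept) : 0.0;
}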

@happyharrycn commented Feb 3, 2017

I do not think this is the best way of ignoring samples for EuclideanLoss (and, in general, for regression losses). You are also throwing away every sample whose target value happens to be close to ignore_label. A more principled way is to implement some sort of masking function: for example, allow a third bottom blob to carry a (binary) mask. This mask can then be used in the backward pass to reset/block the gradients.
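
As a rough illustration of that idea, here is a standalone C++ sketch of a masked L2 loss in which a binary mask removes elements from both the forward loss and the backward gradient. The function name and the choice to normalize by the number of unmasked elements are assumptions for the example, not an existing Caffe API.

#include <vector>

// Sketch of a masked Euclidean (L2) loss: mask[i] == 0 removes element i from
// both the loss and the gradient with respect to the prediction.
double MaskedL2Forward(const std::vector<float>& pred,
                       const std::vector<float>& target,
                       const std::vector<float>& mask,   // binary: 0 or 1
                       std::vector<float>* grad) {       // d(loss)/d(pred)
  double loss = 0.0;
  double kept = 0.0;
  for (size_t i = 0; i < pred.size(); ++i) {
    const double diff = (pred[i] - target[i]) * mask[i];
    loss += 0.5 * diff * diff;
    kept += mask[i];
  }
  const double norm = kept > 0.0 ? kept : 1.0;
  grad->assign(pred.size(), 0.0f);
  for (size_t i = 0; i < pred.size(); ++i) {
    (*grad)[i] = static_cast<float>((pred[i] - target[i]) * mask[i] / norm);
  }
  return loss / norm;
}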

@matthill (Author) commented Feb 4, 2017

That seems like a more complex solution (both in terms of implementation and use) without much benefit. Realistically, how many spurious data points will fall within such an extremely narrow range? I think probably zero or very close to it.

@BlGene (Contributor) commented Feb 17, 2017

@matthill: Can you please run make lint on your PR and fix the reported items?

Just to cross-reference: this would make #4920 redundant.

@Noiredd (Member) commented Feb 2, 2018

Closing as a better approach to this has been suggested.

@Noiredd closed this Feb 2, 2018

@nnop commented Apr 8, 2018

The suggested approach (by jeff) didn't consider the normalization issue. @Noiredd

@Noiredd (Member) commented Apr 9, 2018

I see. But wouldn't adding a mask input to the EuclideanLossLayer be a better way to go? An ignore label in the case of an L2 loss not only feels confusing (we don't even have actual labels here), but also pointless: effectively it means "never learn the value x", which isn't generally useful. Taking a mask input (i.e. "do not learn from these examples") would be more universal, and it still fulfills @matthill's original reason for implementing this.

I'll reopen this for further conversation.

@Noiredd reopened this Apr 9, 2018