Fixed AttentionalDecoder truncating to 20 chars #85
base: master
Conversation
Instead, take the length of the longest known token
Hey :)
I continued developing Pie on the PaPie repository. Given #84 (comment), I don't know if @emanjavacas is gonna support Pie more (I'll let him reply :D)
As for the fix, while the issue you note is important, I am a bit puzzled by this sentence in the issue that is fixed:

"Before this fix, we would get extremely low test scores for datasets with e.g. long part of speech tags"

Normally, PartOfSpeechTags are categorical, not generated. Can you provide your configuration? As for the lemmas though, you are right, it might be an issue; I'll check that next week (I am going on vacation tonight).
Hi both,
We could just increase the default max_seq_len to accommodate. However, this decoder is only meant for tasks like lemmatization. It's hard to imagine a language with lemmas longer than 20 characters, but still, we could enlarge the default max_seq_len. If you are using this decoder for POS tagging, I can't see the use for that right now. And yeah, POS labels are categorical anyway, you could always map them to integers or use whatever coding scheme.
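As a toy illustration (this is not PIE code), a categorical tagging scheme is just a lookup table between tags and integer ids, with no character-by-character generation involved:

    # Toy example (not PIE code): POS tags are a closed set of labels,
    # so they can be mapped to integer ids rather than generated char
    # by char with an attentional decoder.
    tags = ["NOUN", "VERB", "ADJ", "ADP"]
    tag2id = {tag: i for i, tag in enumerate(tags)}   # {"NOUN": 0, ...}
    id2tag = {i: tag for tag, i in tag2id.items()}
    assert id2tag[tag2id["VERB"]] == "VERB"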
I am not going to commit to big changes in PIE, but I think it's still quite robust and should meet the demands of many. I tested it on pytorch 2.0 and it passes all tests. But I don't want to change it in a way that makes it harder to maintain, which is probably the reason why Thibault (hi Thibault!) forked it.
The config file used had the tasks defined as:

    "tasks": [
        {
            "name": "lemma",
            "target": true,
            "context": "sentence",
            "level": "char",
            "decoder": "attentional",
            "settings": {
                "bos": true,
                "eos": true,
                "lower": true,
                "target": "lemma"
            },
            "layer": -1
        },
        {
            "name": "pos",
            "target": false,
            "context": "sentence",
            "level": "char",
            "decoder": "attentional",
            "settings": {
                "bos": true,
                "eos": true,
                "lower": true,
                "target": "pos"
            },
            "layer": -1
        }
    ],

I'm guessing the 'pos' task should have been defined at token level, not char level? And I suppose then that you should use a linear or crf decoder, instead of attentional?
Yup, don't use the "attentional" decoder for POS; use "linear" as shown in the example config for PoS:
https://github.com/emanjavacas/pie/blob/master/configs/pos.json#L12
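For instance, the 'pos' task above could be rewritten along these lines (an untested sketch modeled on the linked pos.json; check that example config for the exact keys and defaults):

    {
        "name": "pos",
        "target": false,
        "context": "sentence",
        "level": "token",
        "decoder": "linear",
        "layer": -1
    }

With "level": "token" and a "linear" decoder, each token gets a single categorical label, so no character-level generation (and therefore no truncation) is involved.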
When labels for a task are longer than 20 characters, they would be truncated to 20 characters at decoding time. So instead, we first look at the length of the longest known token and use that as the maximum decoding length.
Before this fix, we would get extremely low test scores for datasets with e.g. long part-of-speech tags: after all, truncated labels don't match the original ones.
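The idea of the fix, as a hedged sketch (the names below are illustrative; the actual attribute holding the known tokens in PIE's label encoder may differ):

    # Sketch of the fix's idea: derive the decoding cap from the data
    # instead of a hardcoded max_seq_len of 20. `known_tokens` stands in
    # for whatever vocabulary the label encoder exposes.
    def max_decode_length(known_tokens, margin=1):
        # +margin leaves room for an appended <eos> symbol
        return max(len(token) for token in known_tokens) + margin

    # e.g. max_decode_length(["constitutionnellement", "chat"]) == 22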
#73 / #74 almost completely rewrites AttentionalDecoder.predict_max() for a speed-up. However, it still contains the same bug mentioned here. Notably, our fix here greatly slows down predict_max() for long labels: for example, I went from 2min44s to 11m00s trying to evaluate the same set & model before and after this fix. Therefore, I'm pinging @PonteIneptique as well, as I'm not sure how this fix would work in the improved predict_max(). Naively inserting the fix at line 344 and removing line 424 throws an error.
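For context on the slowdown: greedy decoding of this kind typically loops for up to max_seq_len steps unless every sequence in the batch has already emitted <eos>, so raising the cap from 20 to the length of the longest known token directly multiplies the work. A schematic sketch of that loop (assumed structure, not PIE's actual predict_max):

    # Schematic greedy decoder, not PIE's actual predict_max().
    # `step_fn` stands in for one step of the attentional decoder.
    import torch

    def greedy_decode(step_fn, hidden, bos, eos, max_seq_len):
        batch = hidden.size(0)
        inp = torch.full((batch,), bos, dtype=torch.long)
        done = torch.zeros(batch, dtype=torch.bool)
        output = []
        for _ in range(max_seq_len):
            logits, hidden = step_fn(inp, hidden)  # logits: (batch, vocab)
            inp = logits.argmax(dim=-1)
            output.append(inp)
            done |= inp.eq(eos)
            if done.all():  # early exit once every sequence hit <eos>
                break
        return torch.stack(output, dim=1)  # (batch, steps)

With a larger max_seq_len, any batch in which some sequence never produces <eos> runs the full loop, which would be consistent with the 2min44s to 11m00s observation above.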