
Conversation

@PrinsINT

When labels for a task are longer than 20 characters, they were truncated to 20 characters during decoding. Instead, this fix first looks at the length of the longest known token and uses that as the limit.

Before this fix, we would get extremely low test scores for datasets with, e.g., long part-of-speech tags: after all, truncated labels don't match the original ones.
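
To make the idea concrete, here is a minimal sketch of the length computation (plain Python, not pie's actual API; the function and argument names are made up for illustration):

    # Hypothetical sketch of the idea behind the fix: derive the decoding
    # limit from the longest known token instead of hard-coding 20.
    def max_decode_length(known_tokens, extra=2, fallback=20):
        """Length of the longest known token, plus room for markers like <bos>/<eos>."""
        if not known_tokens:
            return fallback
        return max(len(tok) for tok in known_tokens) + extra

    # A long part-of-speech label easily exceeds the old limit of 20:
    tag = "PRON(type=pers,case=nom,number=sg,person=3)"
    print(len(tag), max_decode_length([tag]))  # prints 43 45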

#73 / #74 almost completely rewrites AttentionalDecoder.predict_max() for a speed-up, but it still contains the same bug mentioned here. Notably, our fix greatly slows down predict_max() for long labels: for example, evaluating the same set and model went from 2m44s before this fix to 11m00s after it.

Therefore, I'm pinging @PonteIneptique as well, as I'm not sure how this fix would work in the improved predict_max().

Naively inserting the fix at line 344 and removing line 424 throws an error:

Traceback (most recent call last):
  File "/pie/pie/evaluate.py", line 84, in <module>
    run(model_path=args.model_path, test_path=args.test_path,
  File "/pie/pie/evaluate.py", line 56, in run
    for task in model.evaluate(testset, trainset,
  File "/pie/pie/pie/models/base_model.py", line 83, in evaluate
    preds = self.predict(inp, **kwargs)
  File "/pie/pie/pie/models/model.py", line 314, in predict
    hyps, prob = decoder.predict_max(
  File "/pie/pie/pie/models/decoder.py", line 362, in predict_max
    outs, _ = self.attn(outs, enc_outs, lengths)
  File "/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/pie/pie/pie/models/attention.py", line 164, in forward
    mask = mask.unsqueeze(0).expand_as(weights)
RuntimeError: The expanded size of the tensor (20) must match the existing size (43) at non-singleton dimension 2.  Target sizes: [1, 12576, 20].  Tensor sizes: [1, 12576, 43]
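
For illustration, the mismatch can be reproduced in isolation with the sizes from the traceback; this is a standalone sketch, not pie's attention code:

    # Standalone reproduction of the shape mismatch (not pie's code): the
    # attention weights are sized with one maximum length, the mask with
    # another, so expand_as() cannot broadcast the mask onto the weights.
    import torch

    n_items, old_max, new_max = 12576, 20, 43  # sizes taken from the traceback

    weights = torch.randn(1, n_items, old_max)             # built with the old limit
    mask = torch.ones(n_items, new_max, dtype=torch.bool)  # built with the new limit

    # RuntimeError: The expanded size of the tensor (20) must match the
    # existing size (43) at non-singleton dimension 2.
    mask = mask.unsqueeze(0).expand_as(weights)

In other words, the decoding loop and the attention mask apparently need to agree on a single maximum length, which is why simply moving the fix to line 344 is not enough.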

Instead, take the length of the longest known token
@PonteIneptique
Contributor

Hey :)
I continued developing Pie on the PaPie repository. Given #84 (comment), I don't know if @emanjavacas is going to continue supporting Pie (I'll let him reply :D).
As for the fix, while the issue you note is important, I am a bit puzzled by this sentence in the description of the issue being fixed:

Before this fix, we would get extremely low test scores for datasets with, e.g., long part-of-speech tags

Normally, part-of-speech tags are categorical, not generated. Can you provide your configuration?

As for the lemmas, though, you are right that it might be an issue; I'll check that next week (I am going on vacation tonight).

@emanjavacas
Owner

emanjavacas commented Oct 27, 2023 via email

@PrinsINT
Author

The config file used had the tasks defined as:

    "tasks": [
        {
            "name": "lemma",
            "target": true,
            "context": "sentence",
            "level": "char",
            "decoder": "attentional",
            "settings": {
                "bos": true,
                "eos": true,
                "lower": true,
                "target": "lemma"
            },
            "layer": -1
        },
        {
            "name": "pos",
            "target": false,
            "context": "sentence",
            "level": "char",
            "decoder": "attentional",
            "settings": {
                "bos": true,
                "eos": true,
                "lower": true,
                "target": "pos"
            },
            "layer": -1
        }
    ],

I'm guessing the 'pos' task should have been defined at token level, not char level? And I suppose then that you should use a linear or crf decoder, instead of an attentional one?
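
If so, the pos task entry would presumably look something like the sketch below; this is a guess at the intended setup, not a verified configuration ("crf" would go in the same "decoder" field, and I'm not sure whether any "settings" block is still needed for a linear decoder, so I've left it out):

        {
            "name": "pos",
            "target": false,
            "context": "sentence",
            "level": "token",
            "decoder": "linear",
            "layer": -1
        }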

@emanjavacas
Owner

emanjavacas commented Oct 27, 2023 via email
