Add more missing arguments #40354
```diff
 def __init__(self, config: Cohere2Config, layer_idx: Optional[int] = None):
-    super().__init__()
+    nn.Module.__init__(self)
```
One question! In some cases, you're replacing super().__init__() with direct calls to the parent class, like here, but in other cases you're doing the opposite, like DeepseekVLImageProcessorFast().__init__(**kwargs) being replaced by super().__init__(**kwargs). I think we generally prefer the super() form unless there's a strong reason not to use it. Why replace super() with nn.Module?
Because some modulars don't want to call the parent constructor: the members the parent constructor sets are immediately rewritten below. That is why some modulars use multiple inheritance with an additional nn.Module base just to run nn.Module's initialization. However, nn.Module is already the grandparent here, so why not call it directly?
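To illustrate the pattern being discussed, here is a minimal pure-Python sketch (the class names are hypothetical stand-ins, not the actual transformers or torch classes): calling the grandparent's `__init__` directly skips the parent's attribute setup, which the child would overwrite anyway.

```python
# Hypothetical stand-ins: Module plays the role of nn.Module (the grandparent),
# ParentAttention the parent whose attributes would be immediately rewritten.
class Module:
    def __init__(self):
        self._initialized = True  # the bookkeeping we still need


class ParentAttention(Module):
    def __init__(self, config):
        super().__init__()
        self.hidden_size = config["hidden_size"]  # child rewrites this anyway


class ChildAttention(ParentAttention):
    def __init__(self, config):
        # Skip ParentAttention.__init__ entirely: run only the grandparent's
        # initialization, then set every member ourselves.
        Module.__init__(self)
        self.hidden_size = config["hidden_size"] * 2


layer = ChildAttention({"hidden_size": 8})
print(layer._initialized, layer.hidden_size)  # → True 16
```

The alternative seen elsewhere in the diff, multiple inheritance with an extra `nn.Module` base, achieves the same effect less directly.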
In other cases, the constructor wants to call the parent constructor but the syntax is wrong.
So in each case I considered both possibilities and decided how to fix it.
Flagging this as a potential modular issue, or something we should give guidance on when writing modular files. cc @ArthurZucker @Cyrilvallez
I would like to add the missing PyLint check after fixing all of these, to prevent future errors.
Those are special syntaxes: in a modular file, super() unravels the parent's code, while Parent.__init__ should be replaced by super() during conversion. I see this is not happening, as modular does not correctly replace Parent.__init__.
I will fix it, but until then please hold on to this PR!
Some changes are in modular files; the changes to the converter code may not apply to them.
Merged the PR that should fix it, feel free to rebase on it! Ping me once it's done so that I can check!
Basically, you should be able to apply the changes to the modulars (i.e. add the missing self arg and remove redundant parent classes), but the modeling files should not change after applying the converter. If they do and it's not intended, something is still wrong
@Cyrilvallez There remain some changes. I believe these are missing corner cases in your recent PRs.
Well, I don't see any in your PR... only the modulars are changed.
@Cyrilvallez Ping
Signed-off-by: cyy <cyyever@outlook.com>
[For maintainers] Suggested jobs to run (before merge)
run-slow: bamba, cohere2, d_fine, data2vec, dia, dots1, ernie4_5, evolla, falcon_h1, falcon_mamba, florence2, gemma, gemma2, gemma3, gemma3n, glm4_moe
What does this PR do?
Follows #40068